Creating a date dimension in dbt and Snowflake

A lot of the data I'm working with doesn't lend itself to complex star schemas (more on data modelling in a later post), but I still want to put in at least some modelling effort. The one thing almost all my tables have in common is a date: my strength training happens on a day, as do my running, walking, and parkrun results, and the weather data I intend to integrate will also carry a date. This means that if I build a date dimension and aggregate my various facts at date level, I can use them together in a single visualisation.

There are various methods I could have used to create the date dimension. I went with a dbt date spine, which was really simple using the dbt_utils package:
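A minimal sketch of what such a model can look like, using the `date_spine` macro from `dbt_utils` (the start and end dates here are illustrative assumptions, not the ones I actually used):

```sql
-- models/dim_date.sql (minimal sketch; date range is an illustrative assumption)
{{ dbt_utils.date_spine(
    datepart="day",
    start_date="cast('2020-01-01' as date)",
    end_date="cast('2030-01-01' as date)"
) }}
```

This generates one row per day between the two dates, giving a base column to enrich with further attributes.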


I then used the date functions in Snowflake to extract relevant fields such as the year, the month, the day name in words, the year-month, and so on.
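As a sketch of that enrichment step, something like the following Snowflake SQL works on top of the spine (the model name `date_spine` and the output column names are my own assumptions):

```sql
-- Enrich the spine with common date attributes (names here are illustrative)
select
    date_day,
    year(date_day)               as year_number,
    month(date_day)              as month_number,
    monthname(date_day)          as month_name_short,  -- e.g. 'Jan'
    dayname(date_day)            as day_name_short,    -- e.g. 'Mon'
    to_char(date_day, 'YYYY-MM') as year_month
from {{ ref('date_spine') }}
```

Every fact table can then join to this dimension on its date column, which is what makes cross-fact visualisations at date grain possible.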
