Creating SCD2 tables in dbt

I don't want this blog to become the dbt blog, so I have taken my time posting about this, but I do fully intend to write more posts on dbt and its cool built-in functionality. Equally, I am quite happy with where my model is at the moment, so until I find a new tool to use or a new data source, I am going to look at expanding the section on the free training available.

dbt caters for SCD2-style tables through a feature called snapshots; the details are covered in the advanced materialization training. I set up my first snapshot by creating an SCD2 table for the activity type dim, so that if a new exercise type is added it will create a new row, and if I delete or modify one of the existing rows it will end-date the old row and insert the new one. The preferred approach is to detect changes from a date column that records the change time; however, I don't have one, so I do the merge against all columns instead.

Snapshots sit in their own folder and have a fairly simple modelling structure shown in my example below. 
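
As a rough illustration of that structure, here is a minimal sketch of a snapshot that compares all columns. The file name, the snapshot name activity_type_snapshot, the model dim_activity_type, the key activity_type_id and the schema snapshots are all hypothetical names chosen for the sketch, not the ones from my project, and the exact config options can vary between dbt versions.

```sql
-- snapshots/activity_type_snapshot.sql (hypothetical file name)
{% snapshot activity_type_snapshot %}

{{
    config(
      target_schema='snapshots',        -- assumed schema for the snapshot table
      unique_key='activity_type_id',    -- assumed primary key of the dim
      strategy='check',                 -- detect changes by comparing column values
      check_cols='all',                 -- no change-date column, so compare every column
      invalidate_hard_deletes=True      -- end-date rows that disappear from the source
    )
}}

-- The query whose results are tracked over time
select * from {{ ref('dim_activity_type') }}

{% endsnapshot %}
```

Running dbt snapshot builds the table on the first run and, on later runs, end-dates changed rows and inserts their new versions. dbt adds the dbt_valid_from and dbt_valid_to columns, so the current version of each row is the one where dbt_valid_to is null. If the source had a reliable change-date column, the timestamp strategy with an updated_at setting would be the usual choice instead of check.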


