Skip to main content

dbt - pivoting on list of values

Say you have a massive table but for ease of use for reporting you need to split numbers out. For example say you are a service that provides numerous different services, each person gets a single form with a mark for each service provided and the transactional system stores this a a row per form per service. For reporting you may well want to pivot these out and have a separate columns for each service with a 1 / 0 and a count for each. 

In traditional databases and tools you have to create all your views in advanced and you cannot pivot on a list of values. The joy I am finding with dbt and jinja coding is that you can create the views dynamically. And whilst that may pose some risk if means that in this scenario a new service code is catered for automatically as you can dynamically generate the column by looping through a list of values. 

I have applied this logic my strength exercise data, pivoting it from rows to columns. Below is the jinja code and the resulting SQL, 

The other thing I used here was from the utils was slugify utility to replace spaces with underscores in my strings so I can use them in column names.

Comments

Popular posts from this blog

Gen AI News - 12/03/2024

Google’s Beta AI Content Rewriting Tool : Google is testing an AI tool that finds and rewrites quality content. However, some critics argue that it may incentivize the production of AI-generated low-quality content 1 . The New York Times and OpenAI Controversy : A court filing alleges that The New York Times paid someone to hack OpenAI’s products using deceptive prompts. The situation raises questions about the ethical use of AI 1 . Optimizing GPTs for Online Visibility : Learn how to increase online visibility and click-through rates for your GPT models in the GPT Store and Google Search with six practical tips 1 . AI Democratizing SEO or Amplifying Incompetence? : Understand what AI can realistically do for SEO and manage expectations regarding results 1 . Google’s “Help Me Write” AI Assistant : Google has launched an AI writing assistant called “Help Me Write” for the Chrome browser. It suggests text based on website context 1 . Google’s Gemini: Laptop-Friendly Open Language Model :...

My Latest project using Gen AI

So recently parkrun removed all their stats and as a keen running who is trying to work their way up the top 100 of their local parkrun I wanted to get some of these stats back and have a bit of "fun" at the same time. So here is a little "ETL" process that I developed with the help of Gen AI.  The steps of my ETL:  Copy and paste data into Google Sheets template where an AI produced formula extracts URLS from the text and puts them into a new field. This effectively allows me to extract the parkrun athlete id, the primary key, and use it in my analysis. I also have a column to autofill the data I am processing.  Use an Gen AI generated Google Apps script to process it into a processed sheet, this allows me to build up a backlog of events (I had over 500 to process).  This is then queried using a Gen AI Google sheets query to extract key information and columns / format times etc. I then ingest the fully processed sheet into Keboola directly from Google Sheets. ...

Zapier

As much as I have enjoyed using Keboola there are some connections that it doesn't have or that just haven't worked for one reason or another. I actually came across Zapier as a solution for bringing in e-mails from parkrun to load my results every week. Honestly I have not found it to be as robust as Keboola but that might just be me archiving my e-mails before it completes it 15 minute poll.  The second use case I am working on is the pulling in Strava data, for a fitness dashboard the fact it has a built in connector for Strava is great, though I am worried given the activities I do that I might reach the limit.  I won't go into details on how to set things up but you can setup 5 Zaps that can run for a combined 100 runs during a month for free.  In my data platform / solution I am using Zaps to load harder to get / automate data. It doesn't add much from a technical point of view as it is just signing into a few account to get the data into Google Sheets for downstr...