I have loaded the scripts from every episode of Friends into a Snowflake database. I aim to analyse the words used and, by creating a roll-up table of various statistics, build a dashboard that anyone can have a play with. That is still a work in progress, so in the meantime I am going to use the data in an episode-by-episode Friends trivia video series.
Recently parkrun removed all their stats, and as a keen runner who is trying to work their way up the top 100 of their local parkrun, I wanted to get some of these stats back and have a bit of "fun" at the same time. So here is a little "ETL" process that I developed with the help of Gen AI. The steps of my ETL: Copy and paste the data into a Google Sheets template, where an AI-produced formula extracts the URLs from the text and puts them into a new field. This allows me to extract the parkrun athlete ID, the primary key, and use it in my analysis. I also have a column to autofill the data I am processing. Use a Gen AI generated Google Apps Script to move it into a processed sheet, which allows me to build up a backlog of events (I had over 500 to process). This is then queried using a Gen AI generated Google Sheets query to extract the key information and columns, format times, etc. I then ingest the fully processed sheet into Keboola directly from Google Sheets. ...
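To make the "extract the athlete ID" and "format times" steps a bit more concrete, here is a minimal Python sketch of the same idea. It is not the formula or Apps Script I actually used. The URL shapes it looks for (`/parkrunner/<id>` and `athleteNumber=<id>`) are assumptions about what a results link looks like, and the function names `extract_athlete_id` and `time_to_seconds` are made up for illustration.

```python
import re
from typing import Optional

# Assumed link shapes (not confirmed by the post):
#   .../parkrunner/1234567
#   ...?athleteNumber=1234567
_ATHLETE_ID = re.compile(r"(?:parkrunner/|athleteNumber=)(\d+)")


def extract_athlete_id(cell_text: str) -> Optional[int]:
    """Return the athlete ID found in a pasted cell, or None if there is no link."""
    match = _ATHLETE_ID.search(cell_text)
    return int(match.group(1)) if match else None


def time_to_seconds(finish_time: str) -> int:
    """Convert 'mm:ss' or 'h:mm:ss' into a total number of seconds."""
    seconds = 0
    for part in finish_time.strip().split(":"):
        # Each further colon-separated field is 60 times smaller than the last.
        seconds = seconds * 60 + int(part)
    return seconds
```

Storing the finish time as a number of seconds makes it easier to sort and average later on, and rows without a link (for example, unregistered finishers) simply come back with no ID.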