Keboola Flows

Finding Keboola was really the thing that kickstarted this project. Without it I would be writing custom code on a Python cloud server and building everything from scratch.

In Keboola you build your data sources and destinations using connection details, which is fairly simple and something I will likely cover in another post; the same goes for transformations. Here, though, I am going to discuss Flows, which is where you bring everything together. On my free account there are some limitations.

My easiest flow is very basic: 

  • Pull the parkrun results e-mail from Gmail into Google Sheets (this step is actually done by a Zapier Zap, not Keboola).
  • Keboola then pulls the data from the sheet into its storage as often as I like, in this case once a week.
  • It then transfers this to the target database. Currently I have this set up to be a MySQL database, but I can, and might, expand that to the Snowflake instance within Keboola.
  • I then, outside of Keboola, connect to the MySQL database from Google Data Studio and make some visualisations. 

Within Keboola Flows you have several tabs: the Builder tab, where you configure your flow; the All Runs tab, where you look at the logs of Flow runs; the Notifications tab, where you configure e-mail notifications for various results of the Flow (Success, Error etc.); and the Versions tab, where you can look at the history of the flow.

You can see many of these steps and the basic config in the gif below.


The end result of this flow is the e-mail date, subject and body being passed into the MySQL database. I then do some data cleansing on this (post to come) and visualise it (currently poorly) in Google Data Studio. I intend to do another post comparing visualisations of this very simple data set in GDS, Power BI and Retool.
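
For anyone curious what this flow amounts to conceptually, here is a rough Python sketch of the same extract-and-load idea. To be clear, this is not how Keboola does it internally and not code I run anywhere: the column names (`email_date`, `subject`, `body`), the table name `parkrun_emails`, the sample rows and the use of SQLite as a stand-in for MySQL are all assumptions made purely for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical sample of a sheet export; the real sheet layout may differ.
SHEET_CSV = """email_date,subject,body
2024-03-02,Your parkrun result,You finished in 24:31
2024-03-09,Your parkrun result,You finished in 24:05
"""


def extract(sheet_csv):
    """Read the sheet export into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(sheet_csv)))


def load(rows, conn):
    """Write the rows to the target table and return the table's row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS parkrun_emails ("
        "email_date TEXT, subject TEXT, body TEXT, "
        "PRIMARY KEY (email_date, subject))"
    )
    # INSERT OR REPLACE keeps a weekly re-run from duplicating old e-mails.
    conn.executemany(
        "INSERT OR REPLACE INTO parkrun_emails "
        "VALUES (:email_date, :subject, :body)",
        rows,
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM parkrun_emails").fetchone()[0]


def run_flow(sheet_csv, conn):
    """One flow run: extract from the sheet, then load to the database."""
    return load(extract(sheet_csv), conn)


# SQLite in memory stands in for the MySQL target in this sketch.
conn = sqlite3.connect(":memory:")
row_count = run_flow(SHEET_CSV, conn)
```

The composite key on date and subject is just one way to make repeated runs safe; in Keboola itself that kind of behaviour would be a setting on the writer rather than something you code.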

