Ideogram.ai - Great AI images for any project

I really like the tool Ideogram for generating AI images. Although still far from perfect, it is so much better at generating text in images than any other AI tool I have tried. For examples, see some of the images I have produced below for my recently released Gen AI books.

Prefect - ETL tool in Python

Having spent a lot of my time playing with Keboola and dbt to load and transform my data, I wanted to have a look at just doing stuff in pure Python. I have previously built the full ETL pipeline for a company in Python but haven't really had a need to touch it in over 4 years. Most of the work I did before was just using pandas with a few connectors to various databases, producing reports in Excel using xlwings. It wasn't pretty but it was effective and everyone was happy with the job that it did. Instead I ended up using the prefect library. Well, I built it all and then integrated it into prefect once I found it. I found it OK: it has some useful features but it is not brilliant, though that could be through lack of use. It does allow you to produce DAGs and offers lots of other useful functionality. Script below.
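As a flavour of the pattern, here is a minimal sketch assuming the prefect 2.x style API; the task bodies and names are placeholders rather than my actual script:

    # Minimal prefect ETL sketch: placeholder tasks wired into a flow.
    import pandas as pd
    from prefect import flow, task

    @task
    def extract() -> pd.DataFrame:
        # Stand-in for reading from the real source systems.
        return pd.DataFrame({"id": [1, 2], "value": [10, 20]})

    @task
    def transform(df: pd.DataFrame) -> pd.DataFrame:
        # Stand-in for the real pandas transformations.
        df["value_doubled"] = df["value"] * 2
        return df

    @task
    def load(df: pd.DataFrame) -> None:
        # Stand-in for writing to a database; CSV keeps the sketch runnable.
        df.to_csv("output.csv", index=False)

    @flow
    def etl_flow():
        # Calling tasks inside a flow is what lets prefect build the DAG.
        load(transform(extract()))

    if __name__ == "__main__":
        etl_flow()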

Loading my Strava Data using Python

I have wanted to load my Strava data into my data platform since I started loading the strength data. I found some really useful instructions that I used as my base here . I basically use the procedures shown to load my last 200 Strava activities. I load these into MySQL, find the new entries, which then get loaded into the main MySQL table, and then bulk load into Snowflake. My next step will be to process this into a more meaningful table, using either dbt or seeing if I can do something smart with Python and a view in Snowflake.
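For context, the core of that load is a single authenticated call to the Strava activities endpoint. A sketch below, assuming you already have an OAuth access token (the token and the column selection are placeholders; the linked guide covers the token refresh):

    # Sketch: pull my latest Strava activities into a DataFrame.
    import pandas as pd
    import requests

    ACCESS_TOKEN = "your-strava-access-token"  # placeholder

    # Strava caps per_page at 200, hence loading the last 200 activities.
    resp = requests.get(
        "https://www.strava.com/api/v3/athlete/activities",
        headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
        params={"per_page": 200, "page": 1},
    )
    resp.raise_for_status()

    activities = pd.json_normalize(resp.json())
    print(activities[["id", "name", "start_date", "distance"]].head())
    # From here: stage in MySQL, diff on activity id to find new rows,
    # then bulk load the new rows into Snowflake.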

AWS training - Cloud Academy free course

One of the things I like about this course is that the instructors are really clear, but also that it provides free labs that allow you to actually sign into AWS and perform some actions to create and do things without worrying that you are going to incur a cost. Today I completed one of the hands-on labs. This was to create a Lambda function, in this case a very basic Python script that searched a website for a keyword. I then placed this on a schedule and used CloudWatch to create a dashboard that monitored the running of this function. Overall it was a very simple use case, but it was also a very simple process to set up. I don't have much to add other than that it is well worth signing up to Cloud Academy for the free training if nothing else. I am tempted, once I have done some more training, to give the paid-for option a go to get the full sandboxes.
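The function itself only needs a few lines. A hedged sketch of that sort of keyword check, using just the standard library available in the Lambda Python runtime (the URL and keyword are made up, not the ones from the lab):

    # Sketch of a minimal Lambda: fetch a page and look for a keyword.
    import urllib.request

    def lambda_handler(event, context):
        url = "https://example.com"      # placeholder target site
        keyword = "data"                 # placeholder search term
        with urllib.request.urlopen(url) as resp:
            page = resp.read().decode("utf-8", errors="ignore")
        found = keyword.lower() in page.lower()
        # Returned values end up in the CloudWatch logs for the dashboard.
        return {"url": url, "keyword": keyword, "found": found}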

Pulling Data from Google Fit

So the next project for me will be to integrate step counts from Google Fit. Given my improved knowledge and understanding of the tools and infrastructure I am using, I can work out how I am going to do this and how I will use this data before I start. The first step was connecting to the Google Fit API and extracting the relevant data (see the sketch below). I will admit that I did the standard developer trick and followed an online guide / Stack Overflow to get this done; my main source was the link attached. My next steps will be:

Use Keboola to connect and load the Google Sheet as an In job, then as an Out job deposit the data into Snowflake.

Use dbt to transform and load the data into the final star schema.

At the very least, thousands of steps per day would be good to have in my exercise fact tables. If I integrate this early enough and in enough places it will test a lot of my dbt understanding, if nothing else. As part of this I am going to also create a proper date dimension and integr...
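The extraction itself boils down to one POST to the Fit REST API's aggregate endpoint. A sketch, assuming you already have an OAuth token with the fitness scope (the token and the seven-day window are placeholders):

    # Sketch: daily step counts from the Google Fit REST API.
    import time
    import requests

    ACCESS_TOKEN = "your-google-oauth-token"  # placeholder
    end_ms = int(time.time() * 1000)
    start_ms = end_ms - 7 * 24 * 60 * 60 * 1000  # last seven days

    body = {
        "aggregateBy": [{"dataTypeName": "com.google.step_count.delta"}],
        "bucketByTime": {"durationMillis": 86400000},  # one-day buckets
        "startTimeMillis": start_ms,
        "endTimeMillis": end_ms,
    }
    resp = requests.post(
        "https://www.googleapis.com/fitness/v1/users/me/dataset:aggregate",
        headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
        json=body,
    )
    resp.raise_for_status()

    # Each bucket is a day; sum the step deltas recorded in that day.
    for bucket in resp.json()["bucket"]:
        points = bucket["dataset"][0]["point"]
        steps = sum(v["intVal"] for p in points for v in p["value"])
        print(bucket["startTimeMillis"], steps)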

dbt - more stuff

The more I use dbt, the more I like it. I am finding many of its features really useful and I haven't even done the training on macros and packages yet, so I feel there is more to come. In the meantime I have now started, just for the fun of it, to create some downstream views with dependencies on other steps and a function in SQL. Happy to say it is all working really well, and using jinja (and my Snowflake function ) has saved me a heap of time coding.

Sources yml:

View using the source function (results in SQL):

View that references the output from previous steps, allowing them to be linked:

Assuming you create your sources in the yml file and reference previous steps using the ref function rather than calling the resulting table (dbt handles that for you, as shown above), it will automatically work out the dependencies, run things in the right order and produce a lovely lineage graph like so. I am hoping to stop playing with what I know of dbt and might make...
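To make the dependency point concrete: because each model declares what it reads via source() or ref() instead of hard-coding table names, dbt can sort the models topologically before running them. A toy Python illustration of that idea, with made-up model names, and obviously not dbt's actual implementation:

    # Toy illustration: declared refs give you a run order for free.
    # Requires Python 3.9+ for graphlib.
    from graphlib import TopologicalSorter

    # Each model mapped to the models it ref()s; names are made up.
    models = {
        "stg_strength": set(),                       # reads a source only
        "stg_parkrun": set(),
        "fct_exercise": {"stg_strength", "stg_parkrun"},
        "rpt_weekly": {"fct_exercise"},
    }

    # static_order yields each model only after everything it refs.
    for model in TopologicalSorter(models).static_order():
        print("run", model)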

dbt training

One of the tools I am hoping to get to grips with is dbt. It appears to be a very popular tool at the moment. I think with the trend of moving to ELT, having a good tool to perform your transformations is important, and from what I hear dbt is good. I have signed up for the free dbt Cloud developer account and connected it to my Snowflake instance, but after that I am not quite sure what I am meant to be doing. dbt has its own training, so I am starting with the dbt Fundamentals course. The training is supposed to take several hours, with a few more hours implementing the hands-on project, and gives you a badge for LinkedIn or something. I am more interested in trying out the tool and seeing what it can do, for free, for this project. I have looked into quite a few training courses over the last few months, covering all the tools I am using for this and things like AWS, and when it comes to actually being useful the dbt training is at the top so far. I skipped some as it was basic for s...

Zoho Analytics

Have I finally found my BI tool, one that lets me import data from Snowflake and share it for free? I know, no sooner had I posted about how hard it was to find a tool that could do anything with Snowflake than I came across Zoho. You can check out my dashboard on the following page . Below is a diagram that outlines the processes I have used to obtain this data. In summary, my parkrun e-mail is pushed to Google Sheets every week by Zapier, and Forms I submit every day are used to track the strength training I do. Keboola is then used to ingest this data into MySQL and/or Snowflake, where I then use views or the built-in transformation processes in Keboola to shift the data into a format for reporting. Google Data Studio then connects to MySQL, and Zoho to Snowflake, to visualise the data.

Data Cleansing View in Snowflake

For part of one of my free ETLs I am using Zaps to transfer e-mails to Google Sheets and then Keboola to transfer the sheets into my Snowflake database. I am familiar with string searching and cleansing in Oracle and in Python, but have not had the chance to do this in Snowflake, so I wanted to give it a go, as a proof of concept if nothing else. There were some differences in functions between Oracle and Snowflake, no INSTR, with POSITION used instead, and some differences in working with dates / timestamps, but overall it was very similar.

The code below is what I ended up using:

I then want to use this to create some overview graphics to allow me to track the success or failure of my ETLs. Assuming these aspects of Retool remain free, you can see how much ETL is going on at this link . In case things aren't working, here is a table of the output I am producing.
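In case the code above does not render, here is a hedged sketch of the same idea, creating the view from Python with the Snowflake connector; the connection details, table and column names are illustrative rather than my actual schema:

    # Sketch: a cleansing view using POSITION/SUBSTR in place of INSTR.
    import snowflake.connector

    # Placeholder credentials; use your own account details.
    conn = snowflake.connector.connect(
        user="USER", password="PASSWORD", account="ACCOUNT",
        warehouse="WH", database="DB", schema="PUBLIC",
    )

    # POSITION finds the marker text; SUBSTR slices out the field after it.
    ddl = """
    create or replace view parkrun_clean as
    select
        substr(raw_line, position('Time:' in raw_line) + 5, 8) as run_time,
        to_date(substr(raw_line, position('Date:' in raw_line) + 5, 10),
                'DD/MM/YYYY') as run_date
    from parkrun_raw
    where position('Time:' in raw_line) > 0
    """
    conn.cursor().execute(ddl)
    conn.close()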

Google Data Studio - Part 1

As part of the project to work on a free data solution I have been looking into data visualisation tools. I have already done a post on Snowflake, which has limited capabilities; I have also used Power BI, however this has limited sharing options on the free plan. In my day jobs I have used various Oracle tools, including OAC utilising the RPD, and, really badly, I have used Excel. Personally the best free tool, honestly probably the best tool apart from its missing data modelling layer, is Google Data Studio. I have found it fairly intuitive and, for the most part, easy to get the results I wanted. Probably the bit I was most impressed with was the data visualisation stuff, see part 2 (on its way); however, setting up supported data sources is also very easy. Note that at the moment Snowflake is not supported by Google itself. To get going: Create a data source, this can use a Google connector or one of the many customer-produced connectors. Once you ...

MySQL - Free

So I was looking at trying to get a cloud-based database that was always on. I wanted to build some visuals over whatever data I ended up building, and having the DB accessible from a cloud server seemed like the easy way. I wanted to keep it free because I hate spending when I don't need to, so that others could use it for free, and because I was sure there must be options out there. In the end my life was made much easier by spending £10, but you can go with the same free option on this site: https://www.freemysqlhosting.net/ Although not super fast or super sized, it gives you a free and easily accessible database. So far I have easily connected using phpMyAdmin, Beekeeper Studio, Python, Google Data Studio and Keboola. I have had no issues at all, unlike several other solutions I have tried, including Heroku. To set up the DB you just set your location and hit start; you will then be e-mailed the connection details; then use your favourite MySQL IDE and you are in. Above i...
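Connecting from Python is only a few lines with a driver like PyMySQL; a sketch with placeholder credentials in the shape of the details the sign-up e-mail gives you:

    # Sketch: connect to the free MySQL instance from Python with PyMySQL.
    import pymysql

    # Placeholders: host, user, password and database come from the
    # sign-up e-mail.
    conn = pymysql.connect(
        host="your-host-from-the-email",
        user="your-user",
        password="your-password",
        database="your-database",
    )
    with conn.cursor() as cur:
        cur.execute("select version()")
        print(cur.fetchone())   # quick sanity check that you are in
    conn.close()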

Snowflake Data Visualisation

Link to the visualisation here . I have been looking for a tool to do free data visualisation on any data source, but more recently on Snowflake. Whilst I can use Power BI, I cannot share with it for free. So I had a quick look at using the built-in Snowflake tool. It was very easy to create a couple of graphs off my basic model using a SQL query as the source, but make no mistake: it is, currently at least, nothing like the major reporting tools out there, with very limited options available, and cannot replace things like Google Data Studio, OAC or Power BI.