Course review: Snowflake - Data Warehousing Workshop
A SQL and data warehousing course in the Snowflake environment
In Snowflake's course “Data Warehousing Workshop (Badge 1)”, you learn the basics of data warehousing in the Snowflake environment.
Why Snowflake
Snowflake is a cloud-based data platform provided as Software as a Service (SaaS) and available on AWS, Azure and GCP. According to the Snowflake documentation, the platform is “not built on any existing database technology or “big data” software platforms such as Hadoop. Instead, Snowflake combines a completely new SQL query engine with an innovative architecture natively designed for the cloud.”
About the course Data Warehousing Workshop
This course takes about 9 hours and teaches the basics of both the Snowflake cloud environment and data warehousing in general. The authors of the course clearly have a sense of humor: they punish you for guessing instead of reading carefully by locking you out of the question for a while. If you enter a wrong answer, you have to wait a few minutes before you can try again. The course is free of charge, but you pay with your time if you don’t take it seriously. At the end, a badge awaits that you can publish on your LinkedIn profile to show the world that you passed this beginner's course.
Course content
You will work hands-on in the cloud, doing a lot of copy-pasting of SQL code but also editing code and using Snowflake wizards to create objects such as file formats, stages and sequences. The course consists of videos, text, quizzes and labs where you need to show that you grasp the basic concepts of Snowflake and the definitions it uses.
At the end of the course, you learn how to load and query semi-structured data in XML and JSON formats by looking at Twitter data.
What I can do now after this course
After this course, I feel that I could start a project where I read lots of XML or JSON files from AWS/Azure/GCP into Snowflake and create normalized views of the data. Snowflake has a data type called VARIANT that is great for semi-structured data. You can read more about the VARIANT data type here.
What I have not learned yet is how to automate a project like the one above or how to check for new files dynamically, but this course is at least a good starting point. I liked it, and I might consider going deeper into Snowflake because the tech seems very good from what I’ve understood so far.
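To make "normalized views" of nested data a bit more concrete, here is a minimal Python sketch of the idea. It is only an analogy for what a view over a VARIANT column does, not Snowflake code and not taken from the course. The record and its field names (`user`, `screen_name`, `entities`, `hashtags`) are made up for illustration, and `flatten_hashtags` is a hypothetical helper.

```python
import json

# Hypothetical tweet-like record; the field names are assumptions for illustration.
RAW = """
{"id": 1,
 "user": {"screen_name": "example_user"},
 "entities": {"hashtags": [{"text": "snowflake"}, {"text": "sql"}]}}
"""


def flatten_hashtags(record):
    """Return one flat row per hashtag found in a nested tweet-like dict."""
    rows = []
    # A record without hashtags simply produces no rows.
    for tag in record.get("entities", {}).get("hashtags", []):
        rows.append({
            "tweet_id": record["id"],
            "screen_name": record["user"]["screen_name"],
            "hashtag": tag["text"],
        })
    return rows


record = json.loads(RAW)
for row in flatten_hashtags(record):
    print(row)
```

Each nested array element becomes its own row with the parent's fields repeated, which is roughly the shape you want from a normalized view over semi-structured data.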