Advanced techniques for beginners
In this article, I want to discuss how to transform data. You perform data transformations on top of data models – databases, data warehouses, reporting solutions – but how do you configure them? I’d like to talk about the latest data transformation tools. Let’s cover some of the nuances of testing modular approaches, scheduling, and data transformation. In the final part of this article, I will provide an example application that executes data modeling tasks using data lineage and self-documentation features. I’d love to know what you think about it.
I’ve seen dozens of different ways to perform data transformations. With over 15 years of big data and analytics experience, I’ve built data pipelines with a variety of design patterns, and I’m sure there are many more. That’s why I love the world of technology so much. The sheer number of possibilities it offers is truly amazing.
What operating system do you use for your data warehouse?
Modern data transformation tools
Modern data transformation tools, also known as data modeling tools or data warehouse (DWH) operating systems, are designed to simplify the SQL data manipulation tasks that create data sets, views, and tables. They often use a SQL-like dialect to run any data definition (DDL) or data manipulation (DML) statement you may need, including testing data transformations and creating custom data sets in development mode.
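To make the DDL and DML distinction concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for a data warehouse. It is not how any particular transformation tool works internally. The table raw_orders, the view customer_totals, and both helper functions are hypothetical names chosen for illustration.

```python
import sqlite3


def build_demo_warehouse() -> sqlite3.Connection:
    """Create an in-memory database with a raw table and a derived view.

    All object names here are hypothetical and exist only for this sketch.
    """
    conn = sqlite3.connect(":memory:")

    # DDL: define the raw source table.
    conn.execute(
        "CREATE TABLE raw_orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )

    # DML: load a few rows into the source table.
    conn.executemany(
        "INSERT INTO raw_orders (order_id, customer, amount) VALUES (?, ?, ?)",
        [(1, "alice", 20.0), (2, "bob", 35.5), (3, "alice", 4.5)],
    )

    # DDL: define the transformation as a view over the raw table.
    conn.execute(
        "CREATE VIEW customer_totals AS "
        "SELECT customer, SUM(amount) AS total_amount, COUNT(*) AS order_count "
        "FROM raw_orders GROUP BY customer"
    )
    conn.commit()
    return conn


def check_transformation(conn: sqlite3.Connection) -> bool:
    """A simple data test: totals in the view must reconcile with the raw table."""
    raw_total = conn.execute("SELECT SUM(amount) FROM raw_orders").fetchone()[0]
    view_total = conn.execute(
        "SELECT SUM(total_amount) FROM customer_totals"
    ).fetchone()[0]
    return abs(raw_total - view_total) < 1e-9


if __name__ == "__main__":
    conn = build_demo_warehouse()
    query = (
        "SELECT customer, total_amount, order_count "
        "FROM customer_totals ORDER BY customer"
    )
    for row in conn.execute(query):
        print(row)
    print("transformation test passed:", check_transformation(conn))
```

The reconciliation check is only one possible kind of transformation test. It is used here because it needs nothing beyond the two objects the sketch creates.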
These tools are very useful because there is an abundance of ANSI-SQL data warehouse solutions on the market. For example, take a look at the list of dbt adapters below. All the market leaders are represented there.
dbt stands for data build tool. It is basically a scheduler application that can be run locally or on a server to run data transformation tasks. For example, let’s look at the simple model below. It creates a view in the database and can materialize the view every five minutes to keep the data current for analysis. At the top of the file we have…