Modern Analytics With DBT

Download as pdf or txt
Download as pdf or txt
You are on page 1of 11
At a glance
Powered by AI
The key takeaways are that dbt enables analysts to work more like software engineers by bringing principles of software engineering to SQL and transforming data in warehouses. It also takes time to change mindsets to this new way of working.

Dbt is trying to solve problems analysts face like needing to modify past analyses, experiencing incorrect data from duplicate records when joining tables, finding others' queries hard to read, and manually modifying queries when data changes. It aims to make the transform process more standardized and reusable.

Some benefits of dbt highlighted are version control of queries, using Jinja templating for more effective SQL writing, built-in unit testing capabilities, automatic dependency graph generation for documentation, and reuse of SQL between models with macros.

MODERN

ANALYTICS
WITH
DATA
BUILD TOOL

Hermes Squad - March 19, 2021


OUR TEAM

DS Nghia Trinh PM Linh K Do DA Thuy Tran DA Phuong Hoang

DA Bao Truong DA Rachel Nguyen DA Kieu Giang DA Hai Tran


PROBLEM
Have you ever:
● Been asked to modify an analysis that you have done a couple of weeks ago?
● Experienced data incorrectness cause you accidentally duplicate records while
joining tables?
● Been frustrated when reading others’ queries?
● Wished you can write for-loop for repeated SQL
● Manually modify your queries as DE changes table names
DBT INTRODUCTION
● dbt (data build tool) is a command-line tool. It enables analytics engineers (and data
analyst) to transform data in their warehouses by simply writing select statements.
● dbt enables analysts to work more like software engineers. dbt brings software engineering
principles to the SQL world. For example, select statements can be versioned and reused;
schema and data can be tested which would be similar to conducting unit tests.
● dbt does the T in ELT (Extract, Load, Transform) processes. It doesn’t extract or load data,
but it’s extremely good at transforming data that’s already loaded into your warehouse.
● dbt is a compiler and a runner. dbt’s only function is to take code, compile it to SQL, and then
run against your database. dbt supports a majority of the most common databases, such as
Postgres, BigQuery, Snowflake and Redshift.
● Fishtown analytics recently raised $29.5M Series B for dbt last year.
BENEFITS OF DBT
1. Version control
It’s critical that analysts version control their queries since (1) they will always need it again
(2) Queries are living artifacts that change over time, and (3) SQL queries contain built up domain
knowledge so sharing queries with others is a good way to spread institutional knowledge about
our data.
2. Take the power of Jinja to write SQL more effectively
Jinja is a templating language that give you the ability to do things that aren’t normally
possible in SQL, such as for-loop, using macro (function), using environment variables, etc.
BENEFITS OF DBT (cont.)
3. Unit testing (predefined and user-defined test)
Testing is super important to ensure the quality of your analysis. Dbt offers a very convenient
built-in testing functionality, so that you could check for null, unique value or define a test case to
test your business logic.
4. Automatic dependency graph generation
Dbt enables you to generate documentation about your project so anyone looking in can
quickly understand the model and their dependencies.
5. You can re-use SQL between models
Macros in dbt are snippets of SQL that can be invoked like functions from models. This
makes it possible to re-use SQL between models in keeping with the engineering principle of DRY
(Don’t Repeat Yourself).
DEMO

Business problem
Ms. Lam would like to understand our MAF customer’s
behaviour including merchant categories shopping
frequency and payment method on the online and offline
channels, so that she could calculate opportunity sizing for
a new product.
DATA MODEL AND LINEAGE GRAPH
TAKEAWAY

Asking analysts to work Don’t just consider it as a Dbt enables analysts to be


like software engineers is tool, it’s a change in the more involved in the
difficult in the first place, it mindset and way of process of preparing the
takes time to familiarize. working. data (into data marts).
FUTURE

TEAM TRAINING

GT MONTHLY REPORT

SELF-SERVICE
THANKS!
Q&A

You might also like