
DBT FUNDAMENTALS

GLOSSARY

INTRODUCTION TO DBT
PROJECT STRUCTURE
    SOURCES
    SEEDS
    SNAPSHOTS
    MODELS (INTRODUCTION)
    STAGING & INTERMEDIATE MODELS
    INCREMENTAL MODELS
    JINJA, MACROS & VARIABLES
    TESTS
NON-ADVANCED & ADVANCED PIPELINES
RUNNING DBT ON APACHE AIRFLOW
DBT CHEAT SHEET


DBT | Introduction to DBT

ETL vs ELT

ETL: Extract > Transform > Load. ELT: Extract & Load > Transform, where the transform step is developed, tested/documented & deployed (the part dbt covers).

ETL:
- Data is transformed prior to loading it (resource-efficient for the destination system)
- Requires transformation servers & middleware infrastructure
- Suited for legacy systems and smaller data volumes
- Pre-validated, clean data at load time; fewer downstream errors
- Limited flexibility

ELT:
- Loads raw data first, enabling multiple transformation versions from the same source
- Leverages modern DWH computing power for transformations
- Scales with big data (handles larger volumes efficiently)
- Raw data kept alongside transformed views (data integrity)
- More cost-effective (eliminates the need for separate transformation infrastructure)

DBT

Issues dbt solves:
- Lack of testing/documentation
- Rewriting stored-procedure code
- Hard-to-understand transformation code

Key ideas:
- dbt enables SQL-based work on data pipelines using software engineering best practices
- dbt is SQL-first (it compiles SQL code & sends it to your DWH to run)
- Data democratization
- Software engineering best practices (testing, version control, documentation)
- Data lineage/dependency management
- dbt does not store data and has no compute power of its own
- dbt code is stored in a git provider (for versioning)

Two dbt products:
- dbt-core: open source; developing, testing, documentation; Python packages (interacted with via the CLI)
- dbt Cloud: cloud platform with its own IDE; runs dbt-core; complex features (CI/CD, RBAC, environments, notifications, ...)

Architecture (diagram): raw sources (salesforce_src, google_a_src, sql_server_src, postgres_src, csv_src, json_src, kafka_events_src, webhook_clients_src) > data loaders > raw data in the data warehouse > dbt transformation layers (staging, ...) > transformed data > BI tools.

dbt transforms raw data in your DWH into analytics-ready datasets by executing SQL transformations in a specific order, creating a modular and version-controlled transformation layer between your raw data and BI tools.



DBT | Project Structure - Sources

DBT Project Structure

Flow (diagram): Sources (Orders, Payments, Customers) > Snapshots (Customers_scd) > Staging (Stg_Orders, Stg_Payments, Stg_Customers) > Marts (Fact_Orders, Dim_Customers), plus Seeds (Country Codes) and Tests (some generic test, some singular test).

dbt_project (simple example):

    dbt_project/
      seeds/
        country_codes.csv
      snapshots/
      models/
        marts/
          fact_orders.sql
          fact_orders.yml
        staging/
          _sources.yml
          stg_orders.sql
          stg_orders.yml
          stg_payments.sql
          stg_payments.yml
          stg_customers.sql
          stg_customers.yml
      macros/
      target/
      tests/
Sources: raw data tables in your warehouse that dbt references as input data. /tests
Seeds: static CSV files you load into your warehouse through dbt, typically used for small lookup/reference tables.
Snapshots: special models; track historical changes in your source data by implementing slowly changing dimensions Type 2.
Staging Models: first layer of transformation; cleans & standardizes source data in a one-to-one relationship with source tables.
Marts: business-oriented models that combine & transform staging models into analytics-ready datasets.
Tests: assertions you write to validate your data models, ensuring referential integrity, uniqueness, null checks & business logic.

Sources (raw data tables, referenced in dbt as input data)

- Name & describe the data loaded into your warehouse by extract & load tools
- Declare sources in a .yml file under the sources: key
- Use the {{ source() }} function to reference a source from a model:
  - dbt compiles it to the full source table name
  - dbt creates a dependency between the model & the source table
- Add data tests to sources
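
A minimal sketch of the above (source & table names are hypothetical): a source declared in a .yml file, then referenced from a model.

    models/staging/_sources.yml

    version: 2
    sources:
      - name: jaffle_shop          # name used in the {{ source() }} call
        schema: raw                # schema where the EL tool lands the data
        tables:
          - name: orders
            columns:
              - name: id
                data_tests:        # data tests attached directly to the source
                  - unique
                  - not_null

    models/staging/stg_orders.sql

    select * from {{ source('jaffle_shop', 'orders') }}
    -- compiles to something like: select * from analytics.raw.orders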



DBT | Project Structure - Seeds & Snapshots

Seeds (csv files in your dbt project)

- Seeds are version controlled & code reviewable; best suited for static data that changes infrequently
- Seeds are stored in the seeds folder as .csv files
- dbt seed creates a new table inside the DWH schema
- Seeds are referenced in models using the {{ ref() }} function
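
The original slide showed a seed csv file & how the seed is referenced in a model; a minimal sketch with hypothetical names:

    seeds/country_codes.csv

    country_code,country_name
    US,United States
    ES,Spain

    models/staging/stg_customers.sql

    select
        c.customer_id,
        cc.country_name
    from {{ source('jaffle_shop', 'customers') }} c
    left join {{ ref('country_codes') }} cc    -- seeds are referenced with ref()
      on c.country_code = cc.country_code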

Snapshots (implement Type 2 SCD)

- Snapshots implement Type 2 Slowly Changing Dimensions (tracking how table rows change over time) over mutable source tables.
- Stored as .sql files in the 'snapshots' folder; referenced in downstream models using the {{ ref() }} function.

How dbt creates snapshots (in a nutshell):
- First dbt snapshot run: the snapshot table is created; each record gets a 'dbt_valid_from' timestamp, and 'dbt_valid_to' = null (unless a different current value is set via the 'dbt_valid_to_current' config).
- Subsequent runs:
  - Updates to existing records? 'dbt_valid_to' is updated (closed out) on the old version of each changed record, and the new version is inserted with 'dbt_valid_to' = null.
  - New records? They are added with 'dbt_valid_to' = null.
  - Neither? Nothing changes.


DBT | Project Structure - Models (1/2)

Models (Staging, Intermediate, Marts)

Models Best Practices

STAGING MODELS should...
- Have a 1:1 relationship with their source table (keeps clean lineage, easier to troubleshoot)
- Be separated by source (Salesforce, Stripe, GA, ...), organized in separate directories (sets boundaries, ownership)
- Hold light transformations only (ie. column renames, standardization)
- Follow a consistent naming convention: stg_[source]_[entity]s.sql
- Example: stg_stripe_payments in /models/staging/stripe

INTERMEDIATE MODELS should...
- Be modular, built as reusable building blocks referenced by multiple downstream models
- Break down complexity into smaller, more manageable models
- Be grouped by area of concern (ie. orders, inventory, customers, ...)
- Follow a consistent naming convention: int_orders_pivoted.sql
- Example: int_payments_pivoted_to_orders in /models/intermediate/finance

MARTS should...
- Represent business concepts/entities as defined by stakeholders (ie. orders, customers, products)
- Follow dimensional modeling principles: fact tables (business metrics, events); dimension tables (descriptive attributes, context for facts)
- Be organized by business domain/functional area (ie. finance, sales, marketing, ...)
- Example: payments in /models/marts/finance

Problems/Challenges
- Streaming data: dbt is primarily designed for batch processing (no native support for real-time data transformations).
- Best practices are harder to implement in dbt-core (limited vs dbt Cloud).
- Over-reliance on dbt may lead to bad practices (ie. creating overly complex transformations, neglecting database-level optimizations, building too many intermediate tables, ...).
- Hard to deal with custom DDL (ie. specific table properties, custom storage patterns, complex partitioning strategies, custom indexes, or materialized views with specific configurations).

SQLFluff
- SQLFluff: a SQL linting tool that helps maintain consistent, high-quality SQL code in your data models.
- SQL rule checking: analyzes your SQL code against a predefined set of rules and best practices, such as keyword capitalization; proper indentation & formatting; appropriate spacing around operators; naming conventions; code structure/organization; query complexity & performance considerations.
- Code fixing capabilities: automatically fixes many common problems in your SQL code: reformats code to match style guidelines; fixes indentation issues; standardizes capitalization; corrects spacing; restructures queries for better readability.
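
A sketch of typical SQLFluff usage on a dbt project (assumes the sqlfluff and sqlfluff-templater-dbt packages are installed; the dialect below is an assumption):

    .sqlfluff

    [sqlfluff]
    dialect = snowflake      # assumption: swap for your warehouse's dialect
    templater = dbt          # renders Jinja via dbt before linting

    $ sqlfluff lint models/   # report rule violations
    $ sqlfluff fix models/    # auto-fix the fixable issues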



DBT | Project Structure - Models (2/2)

Models (Staging, Intermediate, Marts) - continued

- dbt model: a chunk of code materialized as an object in your DWH, primarily written as a SELECT statement & saved as a .sql or .py file.
- Model folder:
  - Project file (.yml): tells dbt the project context.
  - Models (.sql or .py): let dbt know how to build a specific data set.
- dbt run builds the model in the data warehouse by wrapping it in a create view as or create table as statement.
- Model configuration:
  - Set in dbt_project.yml or in a {{ config() }} block in the model file.
  - Model materialization (the SQL used to create it).
  - Build models into separate schemas.
  - Apply tags to models.
- Model dependencies: use the {{ ref() }} function (see the sketch below).
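
A minimal model sketch showing a config block and {{ ref() }} dependencies (hypothetical names):

    models/marts/fact_orders.sql

    {{ config(materialized='table', schema='marts', tags=['nightly']) }}

    select
        o.order_id,
        o.customer_id,
        sum(p.amount) as total_amount
    from {{ ref('stg_orders') }} o
    left join {{ ref('stg_payments') }} p
      on o.order_id = p.order_id
    group by 1, 2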

DBT Materializations
- View: virtual table; stores the query logic only & is rebuilt on every run (minimal build costs).
- Table: persistent physical table; fully rebuilds all data on every run; fastest query performance.
- Incremental: hybrid approach; processes only new/changed records since the last run; balances performance & cost.
- Ephemeral: temporary materialization; exists only during model compilation (no physical object in the DWH).

Trade-off chart (View | Incremental | Table): build time, build costs & complexity go from fast/low/simple (views) through medium/moderate (incremental) to slowest/high (tables); query costs run the opposite way, since a view re-computes its logic on every query.
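
Materializations can also be set per folder in dbt_project.yml; a sketch (project name is hypothetical):

    models:
      my_dbt_project:
        staging:
          +materialized: view    # cheap to build, recomputed at query time
        marts:
          +materialized: table   # rebuilt on every run, fastest to query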



DBT | Project Structure - Incremental Models

Models (Incremental)

Incremental models apply transformations only to rows with new/updated info (maximizing efficiency & avoiding unnecessary computation costs).

Requirements:
- a filter to select new/updated records
- a conditional block wrapping the filter, stating when to apply it
- a configuration stating the incremental approach
- a timestamp column stating when a record was updated ('updated_at')
- a cut-off timestamp: the most recent timestamp from the table already in our DWH

How incremental models work (diagram): on a normal run, only the new records from the source are merged/appended into the existing model ({{ this }}); a "full refresh" rebuilds the existing model from scratch.

Anatomy of an Incremental Model

{{ config() }}: configuration settings for an incremental model.
- materialized: incremental
- incremental_strategy: states how to handle new or updated data in the existing DWH model:
  - 'merge': updates existing records and inserts new ones
  - 'append': only adds new records, without touching existing data
  - 'delete+insert': removes and reloads all data for a specified partition/subset
  - 'microbatch': processes data in small time-based chunks for efficiency
- unique_key: column used to apply the model's incremental strategy.
- on_schema_change: handles changes in the model schema:
  - 'ignore': default behavior
  - 'fail': triggers an error when source & target schemas diverge
  - 'append_new_columns': appends new columns to the existing table (it does not remove columns from the existing table that are not present in the new data)
  - 'sync_all_columns': adds new columns to the existing table & removes columns now missing

{% if %} ... {% endif %}: a Jinja statement wrapping the incremental logic (a where clause with an 'updated_at' cutoff) in a conditional.

{{ this }}: a variable used to self-reference the model; compiles to the model as it exists in the DWH.

is_incremental(): a built-in dbt macro that returns true only if three conditions are met:
1. materialized = 'incremental'
2. a table already exists for this model in our DWH
3. the --full-refresh flag is not passed (a full refresh overrides the incremental materialization and builds the table from scratch again)
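
Putting the pieces together, a minimal incremental model sketch (hypothetical names):

    {{
        config(
            materialized='incremental',
            incremental_strategy='merge',
            unique_key='order_id',
            on_schema_change='append_new_columns'
        )
    }}

    select
        order_id,
        customer_id,
        amount,
        updated_at
    from {{ ref('stg_orders') }}

    {% if is_incremental() %}
    -- cut-off filter: only rows newer than the latest row already in {{ this }}
    where updated_at > (select max(updated_at) from {{ this }})
    {% endif %}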



DBT | Project Structure - Jinja, Macros & Variables

Jinja

Jinja is a Python templating library that extends dbt's SQL capabilities to:
- Use control structures (ie. 'if' statements, 'for' loops)
- Leverage variables and/or the results of one query in another query
- Abstract snippets of SQL into reusable macros

Jinja leverages delimiters:
- Expressions {{ ... }}: used to reference variables and/or call macros. Output a string.
- Statements {% ... %}: used for control flow (ie. for loops, if statements, setting/modifying variables). No string output.
- Comments {# ... #}
- The {{ ref() }} and {{ source() }} functions provide lineage & dependency management.

dbt commands: dbt compile compiles models to SQL and stores the output in the target folder.
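
A minimal sketch of a Jinja control structure in a model (the payment methods are hypothetical): a for loop generating one aggregated column per method.

    {% set payment_methods = ['credit_card', 'coupon', 'bank_transfer'] %}

    select
        order_id,
        {%- for method in payment_methods %}
        sum(case when payment_method = '{{ method }}' then amount else 0 end)
            as {{ method }}_amount{% if not loop.last %},{% endif %}
        {%- endfor %}
    from {{ ref('stg_payments') }}
    group by 1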

Macros

Macros are reusable pieces of code (aka dbt functions), defined in .sql files inside the 'macros' folder. How to use macros? Write your own macro, or call open-sourced macros (e.g. with the dbt-utils library installed) inside a SQL statement.

dbt-utils macros help tackle tasks like schema testing, data validation, utility functions for string operations, date manipulations, generating surrogate keys, or common SQL operations (pivoting/unpivoting, date generation, ...).
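
A sketch of both approaches (the macro name is hypothetical; dbt-utils is assumed installed via packages.yml):

    macros/cents_to_dollars.sql

    {% macro cents_to_dollars(column_name, precision=2) %}
        round({{ column_name }} / 100.0, {{ precision }})
    {% endmacro %}

    called inside a model, alongside a dbt_utils macro:

    select
        {{ dbt_utils.generate_surrogate_key(['order_id', 'payment_id']) }} as payment_key,
        {{ cents_to_dollars('amount_cents') }} as amount_usd
    from {{ ref('stg_payments') }}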

Variables

Variables are defined inside the dbt_project.yml file, and can be scoped globally or to a specific package.
- vars can be overridden on the command line following the pattern dbt run --vars '{"key": "value"}'
- vars can be accessed in a SQL statement through the var() function


DBT | Project Structure - Tests

Tests (Data & Unit)

Data tests test your data:
- Run after the model is materialized
- Defined as SQL queries
- Supported for both SQL & Python models
- Two types: generic or singular
  - Generic: can be reused over & over, forked for any model. Defined in a test block with a parametrized query including arguments; test properties are added in a .yml file. Note: dbt ships with 4 built-in generic tests: 'unique', 'not_null', 'accepted_values', 'relationships'.
  - Singular: one-off assertions for a single purpose & model. Defined in .sql files in the tests folder.

Unit tests test your SQL logic:
- Run before the model is materialized
- Defined in .yml files
- Supported for SQL models only
- Input & expected rows written in SQL, csv, or dict format
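
Sketches of the kinds above (hypothetical names; the unit test syntax follows dbt 1.8+, echoing the original slide's email-validation example):

    generic data tests + a unit test in a model .yml:

    models:
      - name: dim_customers
        columns:
          - name: customer_id
            data_tests:
              - unique
              - not_null

    unit_tests:
      - name: test_is_valid_email_address
        model: dim_customers
        given:
          - input: ref('stg_customers')
            rows:
              - {email: cool@example.com}
        expect:
          rows:
            - {email: cool@example.com, is_valid_email_address: true}

    a singular test in tests/assert_no_negative_payments.sql
    (the test fails if any row is returned):

    select * from {{ ref('stg_payments') }} where amount < 0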

dbt tests' best practices - what types of tests, and where in the dbt project?

Data tests:
- Generic: 'unique', 'not_null', 'accepted_values'; sets & ranges; table shape (row count); dbt_expectations & dbt_utils packages
- Singular: business logic

Unit tests: complex joins/filters, regex functions, incremental logic, window functions

Where/when to apply in a dbt project?
- Sources: source freshness; 'unique' tests; 'not_null' tests; sequential values
- Staging: clean up nulls, dupes/outliers; accepted ranges
- Intermediate: 'unique' & 'not_null' tests (new/grain-protecting columns only)
- Marts: unit tests; singular tests; mutually exclusive ranges; business logic



DBT | Non-Advanced & Advanced Pipelines (1/2)

Non-Advanced dbt pipelines

- Regular refresh: runs & tests your entire project (dbt build)
- Partial regular refresh: runs & tests part of your dbt project:
  - dbt build -s <some selection method>
  - dbt build -s 'tag:my_tag' (everything tagged my_tag)
  - dbt build -s 'source:mysource+' (everything downstream of the source mysource)
  - dbt build -s +fct_orders (fct_orders & everything upstream of it)
  - ... (find more about the different selection methods available in dbt)

Advanced dbt pipelines | 3 approaches

(a) Source Freshness: check & run only the models that depend on fresh sources
- Stale sources: no new data | Fresh sources: new data
- dbt source freshness (checks which sources have fresh data)
- dbt build -s source_status:fresher+ (runs the models that depend on fresher source data)
- Only runs models that have new data (saves compute power)
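
A sketch of the freshness config & commands (hypothetical names; source_status needs the artifacts of a previous freshness run):

    sources:
      - name: jaffle_shop
        loaded_at_field: _etl_loaded_at    # column dbt compares against now()
        freshness:
          warn_after: {count: 12, period: hour}
          error_after: {count: 24, period: hour}
        tables:
          - name: orders

    $ dbt source freshness
    $ dbt build -s source_status:fresher+ --state path/to/prev/artifacts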

(b) WAP (Write-Audit-Publish)
- Ensures thorough review and compliance of changes.
- Reduces the risk of errors in critical systems.
- Provides a clear audit trail for changes, useful for regulatory and compliance purposes.
- dbt commands:
  - Write: dbt run -s audit_table
  - Audit: dbt test -s audit_table
  - Publish: dbt run -s production_table

Workflow (diagram): the pipeline writes to a staging/audit table, and quality checks run against it. If the checks pass, the staging & production partitions are exchanged and the production table feeds the downstream pipelines. If they fail, an alert is fired; when the failed check is blocking, the DQ issues are manually troubleshot before publishing.



DBT | Non-Advanced & Advanced Pipelines (2/2)

(c) CI/CD (Continuous Integration/Continuous Deployment)

- You can configure GitHub Actions to build your project & check that everything is fine.
- CI: build modified models in a test schema when a PR is created.
- CD: build modified models in the prod schema when code is merged to main.
- dbt command: dbt build -s 'state:modified+' --defer --state /path/to/artifacts (runs the modified models & everything downstream)
- dbt compares the modifications against artifacts (captured in target/manifest.json)
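
A minimal GitHub Actions sketch of the CI half (the adapter, artifact path & credentials handling are assumptions; profile configuration is omitted):

    .github/workflows/dbt_ci.yml

    name: dbt-ci
    on: pull_request
    jobs:
      ci:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - uses: actions/setup-python@v5
            with:
              python-version: '3.11'
          - run: pip install dbt-snowflake   # assumption: Snowflake adapter
          - run: dbt deps
          # build only modified models (+ downstream), deferring the rest
          # to the production artifacts
          - run: dbt build -s 'state:modified+' --defer --state prod_artifacts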


CI/CD Workflow (diagram):
1. Branch from master: branch from main & develop as usual.
2. Develop dbt models: create the dev environment.
3. Add tests: add the necessary tests for new/updated models.
4. Write a deployment script: add a script including the dbt commands & tests.
5. Open PR: open the PR in draft mode while adding changes; only when the changes are ready, mark the PR ready for review (this avoids starting the CI & extra costs).
6. CI build starts: the CI build runs models & tests based on the deployment scripts provided (by default it runs all).
7. CI build successful? No: make changes and push again. Yes: ask for review.
8. PR approved? Yes: merge the PR.
9. After the merge, the CI build runs again. If successful, a new git tag is created (the git tag contains the build number); if not, open a new PR to solve the issue (it is not possible to deploy unless main is fixed).



DBT | How to Run DBT with Apache Airflow (1/2)

Bash Operator (the old way)
- Uses bash to run dbt commands; each task is a dbt command.

Pattern 1 - run all commands at once (diagram):
pre_dbt_workflow (EmptyOperator) >> dbt_build (BashOperator: partial regular refresh, dim_customers & everything before it) >> post_dbt_workflow (EmptyOperator)
Trade-offs: faster DAG generation & fewer workers (saving resources), but harder to identify problems & inefficient retries.

Pattern 2 - run each task individually (diagram):
pre_dbt_workflow (EmptyOperator) >> dbt_snapshot (BashOperator) >> dbt_seed & dbt_stg_customers (BashOperators) >> dbt_dim_customers (BashOperator) >> post_dbt_workflow (EmptyOperator)
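
A minimal sketch of the "run all commands at once" pattern (hypothetical project path; Airflow 2.x imports assumed):

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator
    from airflow.operators.empty import EmptyOperator

    DBT_DIR = "/opt/airflow/dbt_project"   # assumption: where the dbt project lives

    with DAG("dbt_bash_dag", start_date=datetime(2024, 1, 1),
             schedule="@daily", catchup=False) as dag:
        pre_dbt_workflow = EmptyOperator(task_id="pre_dbt_workflow")
        # partial regular refresh: dim_customers & everything before it
        dbt_build = BashOperator(
            task_id="dbt_build",
            bash_command=f"cd {DBT_DIR} && dbt build -s +dim_customers",
        )
        post_dbt_workflow = EmptyOperator(task_id="post_dbt_workflow")

        pre_dbt_workflow >> dbt_build >> post_dbt_workflow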



DBT | How to Run DBT with Apache Airflow (2/2)

Dag Operator (runs inside Cosmos): DbtDag() and DbtTaskGroup()

- This DAG runs dim_customers and all of its upstream models (stg_customers & country_codes_seed).
- DbtDag() does not allow joining other tasks.
- DbtTaskGroup() allows other tasks to be passed as part of the DAG (ie. 'pre_dbt_workflow', 'post_dbt_workflow').

DbtDag() operator workflow (diagram): snapshot_customers & country_codes_seed >> stg_customers (run, test) >> dim_customers (run, test), each node a DbtRunLocalOperator.

DbtTaskGroup() operator workflow (diagram): pre_dbt_workflow (EmptyOperator) >> jaffle_shop_cosmos_dag task group (same dbt nodes as above) >> post_dbt_workflow (EmptyOperator).
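
A minimal Cosmos sketch of the DbtTaskGroup() pattern (hypothetical paths; the astronomer-cosmos API may differ across versions):

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.empty import EmptyOperator
    from cosmos import DbtTaskGroup, ProfileConfig, ProjectConfig

    with DAG("jaffle_shop_cosmos_dag", start_date=datetime(2024, 1, 1),
             schedule="@daily", catchup=False) as dag:
        pre_dbt_workflow = EmptyOperator(task_id="pre_dbt_workflow")

        # Cosmos renders the dbt project into one task group,
        # with run & test tasks per model
        dbt_tg = DbtTaskGroup(
            group_id="jaffle_shop",
            project_config=ProjectConfig("/opt/airflow/dbt_project"),
            profile_config=ProfileConfig(
                profile_name="jaffle_shop",
                target_name="dev",
                profiles_yml_filepath="/opt/airflow/dbt_project/profiles.yml",
            ),
        )

        post_dbt_workflow = EmptyOperator(task_id="post_dbt_workflow")
        pre_dbt_workflow >> dbt_tg >> post_dbt_workflow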



DBT | DBT Cheatsheet (1/2)

MAIN COMMANDS
- dbt build: Builds & tests all selected resources (models, seeds, snapshots, tests)
- dbt clean: Deletes the folders in the clean-targets list stated in dbt_project.yml (ie. target)
- dbt clone: Clones selected nodes from the specified state to the target schema(s)
- dbt compile: Compiles (but does not run) the models in a project
- dbt debug: Debugs dbt connections and projects
- dbt deps: Downloads dependencies for a project
- dbt init: Initializes a new dbt project (dbt Core only)
- dbt list: Lists resources defined in a dbt project
- dbt parse: Parses a project and writes detailed timing info
- dbt retry: Retries the last run dbt command from the point of failure
- dbt run: Runs the models in a project
- dbt run-operation: Invokes a macro
- dbt seed: Loads CSV files into the database
- dbt show: Previews table rows post-transformation
- dbt snapshot: Executes "snapshot" jobs defined in a project
- dbt source: Gives tools for working w/ source data (including validating that sources are "fresh")
- dbt test: Executes tests defined in a project

MAIN FLAGS
- --debug: Shows debug-level info in the terminal
- --fail-fast: Makes dbt exit immediately if a resource fails
- --full-refresh: Causes tables and seeds to be recreated
- --help: Shows available commands and arguments
- --log-level: Limits the info in the terminal to the set log level
- --profiles-dir: Sets the path for profiles.yml
- --project-dir: Sets the path for dbt_project.yml
- --record-timing-info: Saves performance profiling information to a file
- --store-failures: Makes dbt store failed rows in a table
- --threads: Specifies the number of threads to use in the run
- --vars: Supplies variables to the project
- --version: Shows the dbt version
- --warn-error: Makes warnings act like errors

DBT DOCS
- dbt docs generate: Generates your project's documentation website. Use the --no-compile argument to skip re-compilation
- dbt docs serve: Starts a webserver to serve your documentation locally. Use --port to choose the port

NODE SELECTION

SYNTAX OVERVIEW
- run: --select (-s), --exclude, --selector, --defer
- test: --select (-s), --exclude, --selector, --defer
- seed: --select (-s), --exclude, --selector
- snapshot: --select (-s), --exclude, --selector
- list: --select (-s), --exclude, --selector, --resource-type
- compile: --select (-s), --exclude, --selector
- freshness: --select (-s), --exclude, --selector
- build: --select (-s), --exclude, --selector, --resource-type, --defer
- docs generate: --select (-s), --exclude, --selector

EXCLUDING MODELS
dbt provides an --exclude flag with the same semantics as --select. Models specified with the --exclude flag will be removed from the set of models selected with --select.
Example:
• $ dbt run --select my_package.*+ --exclude my_package.a_big_model+

GRAPH OPERATORS
Plus operator (+)
• $ dbt run --select my_model+ - select my_model and all children
• $ dbt run --select +my_model - select my_model and all parents
• $ dbt run --select +my_model+ - select my_model, and all of its parents and children
N-plus operator
• $ dbt run --select my_model+1 - select my_model and its first-degree children
• $ dbt run --select 2+my_model - select my_model, its first-degree parents, and its second-degree parents ("grandparents")
• $ dbt run --select 3+my_model+4 - select my_model, its parents up to the 3rd degree, and its children down to the 4th degree
At operator (@)
• $ dbt run --models @my_model - select my_model, its children, and the parents of its children
Star operator (*)
• $ dbt run --select finance.base.* - run all of the models in models/finance/base



DBT | DBT Cheatsheet (2/2)

NODE SELECTION (cont.)

SPECIFYING RESOURCES
The --select flag accepts one or more args. Each arg can be one of:
- A package name
- A model name
- A fully-qualified path to a directory of models
- A selection method (path:, tag:, config:, test_type:, test_name:, etc.)

Examples:
• $ dbt run --select my_dbt_project_name - all models in your project
• $ dbt run --select my_dbt_model - a specific model
• $ dbt run --select path.to.my.models - all models in a specific directory
• $ dbt run --select my_package.some_model - a specific model in a specific package
• $ dbt run --select tag:nightly - models with the "nightly" tag
• $ dbt run --select path/to/models - models contained in path/to/models
• $ dbt run --select path/to/my_model.sql - a specific model by its path

METHODS EXAMPLES
- tag: dbt run -s "tag:nightly"
- source: dbt run -s "source:snowplow+"
- resource_type: dbt list -s "resource_type:test"
- path: dbt run -s "models/staging/github"
- package: dbt run -s "package:snowplow"
- config: dbt run -s "config.materialized:incremental"
- test_type: dbt test -s "test_type:generic"
- test_name: dbt test -s "test_name:unique"
- state: dbt run -s "state:modified" --state path/to/artifacts
- exposure: dbt run -s "+exposure:weekly_kpis"
- metric: dbt build -s "+metric:weekly_active_users"
- result: dbt run -s "result:error" --state path/to/artifacts
- source_status: dbt build -s "source_status:fresher+"
- group: dbt run -s "group:finance"
- access: dbt list -s "access:public"
- version: dbt list -s "version:latest"
- unit_test: dbt list -s "unit_test:*"

SET OPERATORS
Unions (space-delineated)
• $ dbt run --select +snowplow_sessions +fct_orders - run snowplow_sessions, all ancestors of snowplow_sessions, fct_orders, and all ancestors of fct_orders
Intersections (comma-separated)
• $ dbt run --select +snowplow_sessions,+fct_orders - run all the common ancestors of snowplow_sessions and fct_orders
• $ dbt run --select marts.finance,tag:nightly - run models that are in the marts/finance subdirectory and tagged nightly

STATE
Some methods require a manifest file to compare the current state of the project with another state, like the state of a previous invocation or the state of the project in production. The path of this manifest can be passed using the --state flag.

DEFER
Defer allows you to build your project without having to build upstream resources. It requires a state. It is commonly used for Slim CI:
• $ dbt build -s "state:modified+" --defer --state path/to/artifacts



BRUNO LIMA
dbt Tech Lead | Founder @datagym.io

Bruno is a Data Engineer and dbt tech


Lead at phData, a consulting firm
recognized as dbt Partner of the Year
for two consecutive years.

Bruno is a regular dbt instructor at DataExpert.io and the founder of DataGym.io, a leading dbt learning platform.

Check the courses & register today

Bruno’s LinkedIn

/in/brunoszdl/
ZACH WILSON
DataExpert.io Founder

Zach is a data engineer and founder of


DataExpert.io, with years of experience
building data systems at Facebook,
Netflix & Airbnb.

He led growth analytics at Facebook,


built security infrastructure at Netflix, and
developed pricing and availability
systems at Airbnb. At DataExpert.io, he
teaches modern data engineering
through hands-on courses, a free
bootcamp & resources trusted by
thousands of learners.

Zach’s LinkedIn

in/eczachly/
ALBERT CAMPILLO
Analytics Engineer | Technical infographist

Albert is a fractional Analytics Engineer


with 15+ years leading data initiatives at
Chanel, PMI & Nestlé across Europe &
APAC.

He creates technical infographics &


carousels, partners with tech creators &
data founders to scale their LinkedIn
presence with content strategy & design
that builds trust, educates buyers & helps
close more deals.

For collaborations, visit his


LinkedIn profile.

Albert’s LinkedIn

in/albertcampillo/
