ZERO TO DEVELOPER
by Tajamul Khan
CONTENTS
PREREQUISITES
• Data vs Lookup Tables
• Primary vs Foreign Key
• Cardinality
• Filter Flow
DAX
• Introduction
• Engines
• Data Types
• Operators
• VertiPaq
• Calculated Columns
• Measures
• Evaluation Context
GOOD PRACTICES
• Naming Conventions
• Evaluation Order
• DAX Shortcuts
• DAX Comments
• Error Handling
• Variables
SCALAR FUNCTIONS
• Aggregate Functions
• Iterator Functions
• Round Functions
• Information Functions
• Logical Functions
CALCULATE
• Introduction
• Pillars
TABLE FUNCTIONS
• Filter Data Functions
• Table Joins
• Relation Functions
TIME INTELLIGENCE
• Date Table
• Calendar
• Performance Till Date
• Time Period Shift
• Running Total Functions
PERFORMANCE TUNING
• Performance Analyzer
• DAX Studio
PREREQUISITES
Data Blogs Follow Tajamul Khan
DATA VS LOOKUP TABLE
DATA TABLES
Contain measurable values or metrics about the business (quantity, revenue, pageviews)
LOOKUP TABLES
Provide descriptive attributes about each dimension in your model (customers, products)
PRIMARY KEY VS FOREIGN KEY
FOREIGN KEY
Foreign keys contain multiple instances of each value and are used to match the primary keys in
related lookup tables
PRIMARY KEY
Primary keys uniquely identify each row of a table and cannot contain NULL values
CARDINALITY
Cardinality refers to the uniqueness of values in a column. A relationship between two tables can have one of these cardinalities:
• One to Many
• Many to One
• Many to Many
• One to One
For our purposes, all relationships in the data model should follow a "one to many"
cardinality: one instance of each primary key to many instances of each foreign key
FILTER FLOW
Filter flow always runs "downstream" from the lookup table to the data table
Filters cannot flow "upstream" (against the direction of the relationship)
DAX
DAX (Data Analysis Expressions) is a functional language, i.e. execution flows through
function calls. It is used in:
• Power BI
• Analysis Services Tabular
• Power Pivot
It resembles Excel formulas because it was born with Power Pivot
FORMATTING
Code formatting is of paramount importance in DAX.
=SUMX (
    FILTER ( VALUES ( 'Date'[Year] ), 'Date'[Year] < 2005 ),
    IF (
        'Date'[Year] >= 2000,
        [Sales Amount] * 100,
        [Sales Amount] * 90
    )
)
DAX ENGINES
DAX is powered by two internal engines (formula engine & storage engine) which work
together to compress & encode raw data and evaluate DAX queries
FORMULA ENGINE
Receives, interprets and executes all DAX requests
Processes the DAX query then generates a list of logical steps called a query plan
Works with the datacache sent back from the storage engine to evaluate the DAX query
and return a result
STORAGE ENGINE
Compresses and encodes raw data, and only communicates with the formula engine
(doesn’t understand the DAX language)
Receives a query plan from Formula Engine, executes it, and returns a datacache
DAX ENGINES WORKING
DATA TYPES
• Integer (64 bit), Whole Number
• Decimal (Floating Point)
• Currency (Fixed Decimal Number)
• Date, DateTime, Time
• TRUE / FALSE (Boolean)
• String (Unicode String)
Data types represent how values are stored by the DAX storage engine
DAX TYPE HANDLING
OPERATOR OVERLOADING
Operators are not strongly typed
The result depends on the inputs
Example:
"4" + "8" = 12
4 & 8 = "48"
Pay attention to undesired conversions
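The conversions above can be tried directly as a DAX query, for example in DAX Studio. This is a minimal sketch: the column labels are illustrative, and the last two rows show one way to make the conversion explicit instead of relying on the operator.

```dax
EVALUATE
ROW (
    -- Arithmetic operator: both strings are converted to numbers
    "Implicit text to number", "4" + "8",
    -- Concatenation operator: both numbers are converted to text
    "Implicit number to text", 4 & 8,
    -- Explicit conversions make the intent visible to the reader
    "Explicit to number", VALUE ( "4" ) + VALUE ( "8" ),
    "Explicit to text", FORMAT ( 4, "0" ) & FORMAT ( 8, "0" )
)
```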
DAX OPERATORS
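As a brief sketch, the main operator families (arithmetic, comparison, text concatenation, logical) can be exercised in one DAX query. The literal values and column labels are purely illustrative.

```dax
EVALUATE
ROW (
    "Arithmetic", ( 2 + 3 ) * 4 - 10 / 5,        -- + - * /
    "Exponent", 2 ^ 3,                           -- ^
    "Greater or equal", 5 >= 3,                  -- comparison: = > < >= <= <>
    "Not equal", 5 <> 3,
    "Concatenation", "Power" & " " & "BI",       -- & joins text
    "Logical AND", ( 5 > 3 ) && ( 2 > 1 ),       -- && requires both conditions
    "Logical OR", ( 5 > 3 ) || ( 2 > 10 ),       -- || requires at least one
    "IN", "B" IN { "A", "B", "C" },              -- membership in a list
    "NOT", NOT ( 5 > 3 )                         -- negation
)
```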
VERTIPAQ
VertiPaq uses a columnar data structure, which stores data as individual columns (rather
than rows or full tables) to quickly and efficiently evaluate DAX queries
Encoding is used to reduce the amount of memory needed to evaluate a DAX query
VALUE ENCODING
Mathematical process used to reduce the number of bits needed to store integer values
HASH ENCODING
Identifies the distinct string values and creates a new table with indexes
RUN LENGTH ENCODING
Reduces the size of a dataset by identifying repeated values found in adjacent rows
CALCULATED COLUMNS
Allow you to add new, formula-based columns to tables
Values are calculated based on information from each row of a table (has row context)
Appends static values to each row in a table and stores them in the model (which
increases file size)
Recalculate on data source refresh or when changes are made to component columns
Primarily used as rows, columns, slicers or filters
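A minimal sketch of a calculated column, assuming a Sales table with the Quantity and Net Price columns used elsewhere in this deck. The column name Line Amount is illustrative.

```dax
-- Calculated column added to the Sales table.
-- It is evaluated once per row (row context) and the result is stored in the model.
Line Amount = Sales[Quantity] * Sales[Net Price]
```

Because the value is fixed for each row at refresh time, a column like this suits slicing and filtering rather than dynamic aggregation.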
MEASURES
Values are calculated based on information from any filters in the report (has filter context)
Do not create new data in the tables themselves
Recalculate in response to any change to filters within the report
Almost always used within the values field of a visual
A measure shouldn't be called a "calculated measure"; just call it a measure!
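As a hedged illustration, the two measures below assume the same Sales table with Quantity and Net Price columns. The name Average Line Amount is illustrative, and each definition is entered as its own measure.

```dax
-- Measure: re-evaluated for whatever filter context the visual provides
Total Sales = SUMX ( Sales, Sales[Quantity] * Sales[Net Price] )

-- Measure built on another measure; DIVIDE returns BLANK when the denominator is zero or blank
Average Line Amount = DIVIDE ( [Total Sales], COUNTROWS ( Sales ) )
```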
MEASURES VS CALCULATED COLUMNS
MEASURES
• Values are calculated based on information from any filters in the report (has filter context)
• Do not create new data in the tables themselves (doesn't increase file size)
• Recalculate in response to any change to filters within the report
• Almost always used within the values field of a visual
CALCULATED COLUMNS
• Values are calculated based on information from each row of a table (has row context)
• Append static values to each row in a table and store them in the model (which increases file size)
• Recalculate on data source refresh or when changes are made to component columns
• Primarily used as rows, columns, slicers or filters
EVALUATION CONTEXT
Evaluation contexts are the two pillars of DAX: filter context and row context
Filter Context: filters tables
Row Context: iterates rows
TotalSales = SUMX ( Sales, Sales[Quantity] * Sales[Net Price] )
FILTER CONTEXT
Filter context filters the tables in your data model
DAX creates filter context when dimensions are added to rows, columns, slicers & filters
CALCULATE can be used to systematically create or modify existing filter context
Filter context always travels (propagates) from the ONE side to the MANY side of a table
relationship
ROW CONTEXT
Row context iterates through the rows in a table
DAX creates row context when you add calculated columns to your data model
Iterator functions (SUMX, RANKX, etc.) use row
context to evaluate row level calculations
Row context doesn't automatically propagate through table relationships (need to use
RELATED or RELATEDTABLE functions)
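A small sketch of both functions as calculated columns. It assumes a one-to-many relationship from Product to Sales; the Product[Unit Cost] column and both column names are hypothetical.

```dax
-- Calculated column on Sales (many side): pull a single value from the one side
Unit Cost = RELATED ( Product[Unit Cost] )

-- Calculated column on Product (one side): count the Sales rows related to the current product
Number Of Sales Rows = COUNTROWS ( RELATEDTABLE ( Sales ) )
```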
GOOD PRACTICES
NAMING CONVENTIONS
Measures should not belong to a table
• Avoid table name
• [Margin%] instead of Sales[Margin%]
• Easier to move to another table
• Easier to identify as a measure
Use this syntax when referencing:
Columns → Table[Column]
Measures → [Measure]
EVALUATION ORDER
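Evaluation order is the process by which DAX evaluates the parameters in a function
NON NESTED
Arguments are evaluated from left to right (1, 2, 3)
IF ( Test, True, False )
NESTED
Evaluation starts with the innermost function and works outward
SUMX (
    FILTER (
        FILTER ( 'Table', RELATED ( 'Table'[Column] ) ),  -- 1 = innermost filter
        RELATED ( 'Table'[Column] )                       -- 2
    ),
    'Table'[Column]                                       -- 3 = outermost
)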
DAX SHORTCUTS
COMMENT YOUR CODE
Comments can help other users interpret your code, and can be particularly helpful for
complex queries with multiple lines, nested functions, etc.
Single line Comment = -- or //
Multi line Comment = /* ..... */
Bad comment
Total Sales = SUM(Sales[SalesAmount]) --Sum the sales amount
Good comment
Total Sales = SUM(Sales[SalesAmount]) --Calculate total sales amount
ERROR HANDLING
Error handling functions can be used to help identify missing data, and can be
particularly useful for quality assurance and testing
IFERROR()
Returns a value if first expression is an error and the value of
the expression itself otherwise
ISBLANK()
Checks to see if a value is blank, returns True or False
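A hedged sketch of both functions. Sales[Discount Text] is a hypothetical text column that should hold numbers, and [Sales Amount] is assumed to be an existing measure.

```dax
-- Calculated column: VALUE raises an error for non-numeric text, IFERROR substitutes 0
Discount Number = IFERROR ( VALUE ( Sales[Discount Text] ), 0 )

-- Measure: flag combinations of filters that return no sales at all
Sales Status = IF ( ISBLANK ( [Sales Amount] ), "No sales", "Has sales" )
```

IFERROR can hide real data problems and can be slow on large tables, so it tends to work best for quality checks rather than as a default wrapper.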
VARIABLES
Very useful to avoid repeating subexpressions in your DAX code.
Variables can be a helpful tool for testing or debugging your DAX code
Debug Complex Measure =
VAR SalesAmount = SUM(Sales[SalesAmount])
VAR CostAmount = SUM(Sales[Cost])
VAR Profit = SalesAmount - CostAmount
RETURN
Profit
SCALAR FUNCTIONS
COMMON SCALAR FUNCTIONS
A scalar function returns a single value (a scalar) as its result, rather than a table.
Categories: AGGREGATE, ROUND, INFORMATION, LOGICAL
AGGREGATE FUNCTIONS
Functions that can be used to dynamically aggregate values within a column
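A few common aggregate functions, sketched as measures. The Sales table and its Quantity and Net Price columns come from earlier slides; Sales[ProductCode] is an assumed column.

```dax
Total Quantity = SUM ( Sales[Quantity] )
Average Price = AVERAGE ( Sales[Net Price] )
Highest Price = MAX ( Sales[Net Price] )
Lowest Price = MIN ( Sales[Net Price] )
Sales Rows = COUNTROWS ( Sales )
Products Sold = DISTINCTCOUNT ( Sales[ProductCode] )   -- assumed column
```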
ITERATOR FUNCTIONS
The X functions (SUMX, AVERAGEX, RANKX, etc.), known as iterator functions, iterate over a table and evaluate an expression for each row
In reality, SUM is nothing but syntactic sugar for SUMX
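The relationship between SUM and SUMX can be sketched with three measures, assuming the Sales table used earlier in the deck.

```dax
-- These two measures return the same result
Total Quantity = SUM ( Sales[Quantity] )
Total Quantity X = SUMX ( Sales, Sales[Quantity] )

-- An iterator is required once the expression involves more than one column,
-- because SUM accepts only a single column reference
Total Sales = SUMX ( Sales, Sales[Quantity] * Sales[Net Price] )
```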
ROUND FUNCTIONS
Functions that can be used to round values to different levels of precision
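A short query comparing the most common rounding functions on literal values. The labels are illustrative, and the results noted in the comments follow from each function's definition.

```dax
EVALUATE
ROW (
    "ROUND", ROUND ( 3.14159, 2 ),        -- 3.14
    "ROUNDUP", ROUNDUP ( 3.141, 1 ),      -- 3.2  (always away from zero)
    "ROUNDDOWN", ROUNDDOWN ( 3.19, 1 ),   -- 3.1  (always toward zero)
    "INT", INT ( -3.7 ),                  -- -4   (down to the nearest integer)
    "TRUNC", TRUNC ( -3.7 ),              -- -3   (drops the decimal part)
    "MROUND", MROUND ( 17, 5 )            -- 15   (nearest multiple of 5)
)
```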
INFORMATION FUNCTIONS
Functions that can be used to analyze the data type or output of an expression
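A minimal sketch of a few information functions applied to literal values; each one returns TRUE or FALSE.

```dax
EVALUATE
ROW (
    "ISBLANK", ISBLANK ( BLANK () ),       -- TRUE
    "ISNUMBER", ISNUMBER ( 42 ),           -- TRUE
    "ISTEXT", ISTEXT ( "42" ),             -- TRUE: the value is text, even though it looks numeric
    "ISLOGICAL", ISLOGICAL ( TRUE () )     -- TRUE
)
```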
LOGICAL FUNCTIONS
Functions for returning information about values in a conditional expression
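Two calculated columns on the Sales table illustrate typical use. The thresholds and column names are illustrative assumptions.

```dax
-- SWITCH ( TRUE (), ... ) returns the result for the first condition that is true
Order Size =
SWITCH (
    TRUE (),
    Sales[Quantity] >= 100, "Large",
    Sales[Quantity] >= 10, "Medium",
    "Small"
)

-- AND combines two conditions and returns TRUE or FALSE
Is Bulk Premium = AND ( Sales[Quantity] >= 100, Sales[Net Price] > 50 )
```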
CALCULATE
It is one of the most powerful and versatile functions. It allows you to modify the existing filter
context of a calculation, enabling you to perform complex calculations and aggregations.
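A minimal sketch, assuming an existing [Sales Amount] measure and the 'Date'[Year] and 'Product Category'[Category] columns that appear elsewhere in this deck. The category value "Audio" is hypothetical.

```dax
-- One filter argument: replace any existing filter on 'Date'[Year] with Year = 2004
Sales 2004 = CALCULATE ( [Sales Amount], 'Date'[Year] = 2004 )

-- Several filter arguments are combined with AND logic
Audio Sales 2004 =
CALCULATE (
    [Sales Amount],
    'Date'[Year] = 2004,
    'Product Category'[Category] = "Audio"
)
```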
PILLARS OF CALCULATE FUNCTION
EXPANDED TABLES
An expanded table consists of the base table (which is visible to the user), along with columns
from any related table connected via a 1-to-1 or many-to-1 relationship
CONTEXT TRANSITION
Context Transition is the process of turning row context into filter context
By default, calculated columns understand row context but not filter context
To create filter context at the row-level, you can use CALCULATE
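Context transition is easiest to see by comparing two calculated columns on the Product table. The sketch assumes a one-to-many relationship from Product to Sales; the column names are illustrative.

```dax
-- No filter context exists here, so every row shows the grand total quantity
Quantity All Products = SUM ( Sales[Quantity] )

-- CALCULATE turns the current row into a filter (context transition),
-- so each row shows only the quantity sold for that product
Quantity This Product = CALCULATE ( SUM ( Sales[Quantity] ) )
```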
EVALUATION ORDER
MODIFIERS
Modifiers are used to alter the way CALCULATE creates filter context, and are added as filter
arguments within a CALCULATE function
Calculate Modifiers Only
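Three commonly used modifiers, sketched as measures. [Sales Amount], the "Audio" value, Sales[Ship Date] and 'Date'[Date] are assumptions, and USERELATIONSHIP only works if an inactive relationship between those two columns already exists in the model.

```dax
-- REMOVEFILTERS: clear any filter on the category table
Sales All Categories =
CALCULATE ( [Sales Amount], REMOVEFILTERS ( 'Product Category' ) )

-- KEEPFILTERS: add the new filter on top of the existing one instead of replacing it
Sales Audio Kept =
CALCULATE ( [Sales Amount], KEEPFILTERS ( 'Product Category'[Category] = "Audio" ) )

-- USERELATIONSHIP: activate an inactive relationship for this calculation only
Sales By Ship Date =
CALCULATE ( [Sales Amount], USERELATIONSHIP ( Sales[Ship Date], 'Date'[Date] ) )
```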
TABLE FUNCTIONS
COMMON TABLE/FILTER FUNCTIONS
FILTER DATA ADD DATA CREATE DATA
ALL FUNCTION
Returns all the rows in a table, or all the values in a column, ignoring any filters
ALL is both a table function and a CALCULATE modifier
Removes the initial filter context (ignores filters)
Does not accept table expressions (only physical table references)
Returns a table
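Both roles of ALL can be sketched as follows, assuming a [Sales Amount] measure and the 'Product Category' table referenced elsewhere in the deck.

```dax
-- As a CALCULATE modifier: the denominator ignores any category filter
Pct Of All Categories =
DIVIDE (
    [Sales Amount],
    CALCULATE ( [Sales Amount], ALL ( 'Product Category' ) )
)

-- As a table function: count every category, regardless of the current filters
Category Count = COUNTROWS ( ALL ( 'Product Category'[Category] ) )
```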
FILTER FUNCTION
Returns a filtered table, based on one or more filter expressions
FILTER is both a table function and an iterator
FILTER TABLE
Often used to reduce the number of rows to scan
Returns Table
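A hedged example of FILTER used both inside CALCULATE and on its own. The threshold of 1000 and the measure names are illustrative, and [Sales Amount] is assumed to exist.

```dax
-- FILTER iterates Sales and keeps only the rows where the line amount exceeds 1000
Large Line Sales =
CALCULATE (
    [Sales Amount],
    FILTER ( Sales, Sales[Quantity] * Sales[Net Price] > 1000 )
)

-- The filtered table can also be counted directly
Large Line Count =
COUNTROWS ( FILTER ( Sales, Sales[Quantity] * Sales[Net Price] > 1000 ) )
```

Since FILTER visits every row of the table it receives, passing it the smallest suitable table (for example a single column via VALUES) usually keeps it fast.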
DISTINCT
Returns the unique values of a column, only the ones visible in the current filter context.
NumOfProducts =
COUNTROWS ( DISTINCT ( Product[ProductCode] ) )
VALUES
Returns the unique values of a column, only the ones visible in the current filter context,
including the additional blank row if it is visible in the filter context.
NumOfProducts =
COUNTROWS ( VALUES ( Product[ProductCode] ) )
Use DISTINCT to create a new dimension table by extracting unique values from fields in a data table!
DISTINCT VS VALUES VS ALL
VALUES includes the blank row that DAX adds for unmatched keys in a relationship (when that row is visible in the filter context), but DISTINCT does not
SELECTEDVALUE
SELECTEDVALUE is a convenient function that simplifies retrieving the value of a column, when
only one value is visible.
SELECTEDVALUE (
    'Product Category'[Category],
    "Multiple values"
)
Equivalent to:
IF (
    HASONEVALUE ( 'Product Category'[Category] ),
    VALUES ( 'Product Category'[Category] ),
    "Multiple values"
)
ALLEXCEPT
The ALLEXCEPT function in DAX is used to remove all context filters in a table except for the
filters specified in the function arguments.
Our slicer is an external filter
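A minimal sketch, assuming a [Sales Amount] measure and a 'Date' table with a Year column. The measure name is illustrative.

```dax
-- Remove every filter on the 'Date' table except the one on Year,
-- so a month or day selected in the visual still returns the full-year total
Sales Whole Year =
CALCULATE ( [Sales Amount], ALLEXCEPT ( 'Date', 'Date'[Year] ) )
```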
ALLSELECTED
ALLSELECTED() returns all rows in a table or values in a column, ignoring inner filters (i.e.,
those applied by the visual itself) but respecting other existing filter context.
Our slicer is an external filter
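A typical use is a share of the visual total, sketched here with an assumed [Sales Amount] measure and the 'Product Category'[Category] column.

```dax
-- The denominator ignores the category on the current row of the visual
-- but still respects categories chosen in an external slicer
Pct Of Selected Categories =
DIVIDE (
    [Sales Amount],
    CALCULATE ( [Sales Amount], ALLSELECTED ( 'Product Category'[Category] ) )
)
```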
ADD DATA FUNCTIONS
Functions used to specify or add columns based on existing data in the model
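Two functions in this family, ADDCOLUMNS and SELECTCOLUMNS, sketched as DAX queries. [Sales Amount] is an assumed measure and the new column names are illustrative.

```dax
-- ADDCOLUMNS keeps the original columns and appends new ones
EVALUATE
ADDCOLUMNS (
    VALUES ( 'Product Category'[Category] ),
    "Category Sales", [Sales Amount]
)

-- SELECTCOLUMNS returns only the columns that are listed
EVALUATE
SELECTCOLUMNS (
    Sales,
    "Quantity", Sales[Quantity],
    "Line Amount", Sales[Quantity] * Sales[Net Price]
)
```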
CREATE DATA FUNCTIONS
Functions used to create new tables or rows of data from scratch
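A hedged sketch of four ways to create data in DAX; all values and column names are made up for illustration.

```dax
-- Table constructor: a one-column table with three rows
EVALUATE { 1, 2, 3 }

-- GENERATESERIES: values from 0 to 100 in steps of 25
EVALUATE GENERATESERIES ( 0, 100, 25 )

-- DATATABLE: a small static table with explicit column types
EVALUATE
DATATABLE (
    "Band", STRING,
    "Min Quantity", INTEGER,
    {
        { "Small", 0 },
        { "Medium", 10 },
        { "Large", 100 }
    }
)

-- ROW: a single-row table
EVALUATE ROW ( "Label", "Total", "Value", 42 )
```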
TABLE JOINS
RELATION FUNCTIONS
TERMINOLOGY
PHYSICAL TABLE VS VIRTUAL TABLE
Physical tables are loaded into the data model and are visible in it
Virtual tables are temporary, and are created by DAX table expressions
PHYSICAL RELATION VS VIRTUAL RELATION
There are two key types of table relationships: PHYSICAL and VIRTUAL
Physical relationships are manually created, and visible in your data model
e.g., using data modelling
Virtual relationships are temporary, and defined using DAX expressions
e.g., using TREATAS
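A virtual relationship with TREATAS can be sketched as follows. Target is a hypothetical table with a Category column that has no physical relationship to 'Product Category', and [Sales Amount] is an assumed measure.

```dax
-- The categories visible in Target are applied as a filter on 'Product Category'[Category],
-- as if a relationship between the two columns existed for this calculation
Target Category Sales =
CALCULATE (
    [Sales Amount],
    TREATAS ( VALUES ( Target[Category] ), 'Product Category'[Category] )
)
```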
RELATIONSHIP FUNCTIONS
TIME INTELLIGENCE
DATE TABLE
Date Table is very important for Time Intelligence
If you import or create your own date table, it must meet these requirements:
Must contain all the days for all years represented in your fact tables
Must have at least one field set as a Date or DateTime datatype
Cannot contain duplicate dates or datetime values
If using a time component within a date column, all times must be identical (e.g., 12:00 AM)
Should be marked as a date table (not required but a best practice)
BUILDING A REUSABLE DATE TABLE
DateTable =
ADDCOLUMNS (
    CALENDAR ( DATE ( 2020, 1, 1 ), DATE ( 2030, 12, 31 ) ),
    "Year", YEAR ( [Date] ),
    "Month", FORMAT ( [Date], "MMMM" ),
    "Month Number", MONTH ( [Date] ),
    "Quarter", "Q" & QUARTER ( [Date] ),
    "Weekday", FORMAT ( [Date], "dddd" ),
    "Weekday Number", WEEKDAY ( [Date], 2 ), -- return type 2: Monday = 1 ... Sunday = 7
    "Is Weekend", IF ( WEEKDAY ( [Date], 2 ) > 5, TRUE (), FALSE () ),
    "Is Holiday", FALSE () -- replace this with your own logic to identify holidays
)
CALENDAR
Returns a table with one column of all dates between start and end date
CALENDARAUTO()
Returns a table with one column of dates based on a fiscal year end month. The Range of dates
is calculated automatically based on data in the model.
CALENDARAUTO(6) treats June as the fiscal year end month, so the calendar starts on 1 July (01/07)
DATE FORMATTING
Use the FORMAT function to specify date/time formatting. Common examples include:
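A few format strings, sketched as a DAX query on a fixed sample date. The labels are illustrative, and the exact text returned for month and day names depends on the locale in use.

```dax
EVALUATE
VAR SampleDate = DATE ( 2024, 3, 5 )
RETURN
    ROW (
        "ISO date", FORMAT ( SampleDate, "yyyy-MM-dd" ),     -- year, month, day with leading zeros
        "Month and year", FORMAT ( SampleDate, "MMM yyyy" ), -- abbreviated month name
        "Full month", FORMAT ( SampleDate, "MMMM" ),         -- full month name
        "Day name", FORMAT ( SampleDate, "dddd" ),           -- full weekday name
        "Short day name", FORMAT ( SampleDate, "ddd" )       -- abbreviated weekday name
    )
```

FORMAT always returns text, so the formatted values sort alphabetically unless a numeric sort column is set.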
TIME INTELLIGENCE FUNCTIONS
Time Intelligence functions allow you to define and compare custom time periods
PERFORMANCE PERIOD SHIFT RUNNING TOTAL
PERFORMANCE TILL DATE FUNCTIONS
Functions commonly used to calculate performance through the current date
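A minimal sketch of year, quarter and month to date measures. It assumes a [Sales Amount] measure and a date table named 'Date' with a Date column.

```dax
Sales YTD = TOTALYTD ( [Sales Amount], 'Date'[Date] )
Sales QTD = TOTALQTD ( [Sales Amount], 'Date'[Date] )

-- The same pattern written with CALCULATE and a DATES...TD table function
Sales MTD = CALCULATE ( [Sales Amount], DATESMTD ( 'Date'[Date] ) )
```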
TIME PERIOD SHIFT FUNCTIONS
Functions commonly used to compare performance between specific periods
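Period comparisons can be sketched as follows, again assuming a [Sales Amount] measure and a 'Date'[Date] column in a proper date table. The measure and variable names are illustrative.

```dax
Sales Last Year = CALCULATE ( [Sales Amount], SAMEPERIODLASTYEAR ( 'Date'[Date] ) )

Sales Previous Month = CALCULATE ( [Sales Amount], DATEADD ( 'Date'[Date], -1, MONTH ) )

-- Year-over-year growth; DIVIDE returns BLANK when there are no prior-year sales
Sales YoY % =
VAR CurrentSales = [Sales Amount]
VAR PriorSales = CALCULATE ( [Sales Amount], SAMEPERIODLASTYEAR ( 'Date'[Date] ) )
RETURN
    DIVIDE ( CurrentSales - PriorSales, PriorSales )
```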
RUNNING TOTAL FUNCTIONS
Functions commonly used to calculate running totals or moving averages
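Two common patterns, sketched under the same assumptions ([Sales Amount] measure, 'Date'[Date] column). The 30-day window is an arbitrary choice.

```dax
-- Rolling window: the 30 days ending on the last visible date
Sales Rolling 30 Days =
CALCULATE (
    [Sales Amount],
    DATESINPERIOD ( 'Date'[Date], MAX ( 'Date'[Date] ), -30, DAY )
)

-- Running total: everything up to and including the last visible date
Sales Running Total =
CALCULATE (
    [Sales Amount],
    FILTER ( ALL ( 'Date'[Date] ), 'Date'[Date] <= MAX ( 'Date'[Date] ) )
)
```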
PERFORMANCE TUNING
PERFORMANCE ANALYZER
Power BI’s Performance Analyzer can help you troubleshoot issues, measure load times for
visuals and DAX queries, and optimize your code
Power BI Desktop’s Performance Analyzer records user actions (like Excel’s macro recorder)
and tracks the load time (in milliseconds) for each step in the process
DAX STUDIO
DAX Studio is a free tool that allows you to connect to your Power BI data model to test and
optimize your DAX queries
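As an illustration of the kind of query you might run there, the sketch below defines a query-scoped measure and evaluates it by year. The measure name Sales Amount Test is hypothetical, and the Sales and 'Date' tables are assumed from earlier slides.

```dax
DEFINE
    -- Query-scoped measure: exists only for this query, the model is not changed
    MEASURE Sales[Sales Amount Test] =
        SUMX ( Sales, Sales[Quantity] * Sales[Net Price] )
EVALUATE
SUMMARIZECOLUMNS (
    'Date'[Year],
    "Sales Amount Test", [Sales Amount Test]
)
ORDER BY 'Date'[Year]
```

Defining the measure inside the query makes it possible to try alternative formulas and compare timings before touching the model.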