
Lab Manual

DSC354 - Data Warehousing and Business Intelligence

CUI
Department of Computer Science
Islamabad Campus
Lab Contents:
The topics include Data Warehouse: Setting up the Working Environment; Data Warehouse Schemas; Creating views and
indexes on Data warehouse; Dimensional Model; Demonstration of ETL process; Business Intelligence: Data insights and
visualization using PowerBI and Tableau.
Student Outcomes (SO)
S.#  Description
3    Design and evaluate solutions for complex computing problems, and design and evaluate systems, components, or processes that meet specified needs with appropriate consideration for public health and safety, cultural, societal, and environmental considerations
4    Create, select, adapt and apply appropriate techniques, resources, and modern computing tools to complex computing activities, with an understanding of the limitations
5    Function effectively as an individual and as a member or leader in diverse teams and in multi-disciplinary settings
9    Recognize the need, and have the ability, to engage in independent learning for continual development as a computing professional
Intended Learning Outcomes
Sr.#   Description                                                                    Blooms Taxonomy Learning Level   SO
CLO-5  Perform the data warehousing, OLAP and data mining tasks using modern tools.   Creating                         3-4
CLO-6  Implement Business Intelligence Techniques on a Data Warehouse.                Applying                         3-4
Lab Assessment Policy
The lab work done by the student is evaluated using rubrics defined by the course instructor, viva voce, and project work/performance. Marks distribution is as follows:
Assignments: 25   Lab Mid Term Exam: 25   Lab Terminal Exam: 50   Total: 100
Note: Midterm and final term exams must be computer-based.

List of Labs
Lab #    Main Topic                                                                                                  Page #
Lab 01   Set up the Working Environment                                                                              4
Lab 02   Data Warehouse Schemas: Star, Snowflake, Fact Constellation                                                 20
Lab 03   Creating views and indexes on Data warehouse                                                                27
Lab 04   Conversion of Entity Relationship Diagram (ERD) to Dimensional Model (DM): Working with Sample database Sakila   33
Lab 05   Demonstration of ETL Tool: SQL Server Integration Services (SSIS) - Extraction and Loading                  40
Lab 06   Demonstration of ETL Tool: SQL Server Integration Services (SSIS) - Transformation                          59
Lab 07   Creating ROLAP in SQL Server Analysis Services (SSAS)                                                       79
Lab 08   Get Started with Power BI Desktop                                                                           96
Lab 09   Mid Term Exam
Lab 10   Preparing Data in Power BI Desktop                                                                          127
Lab 11   Transformation using Power BI                                                                               142
Lab 12   Data Modeling in Power BI Desktop                                                                           169
Lab 13   Using DAX in Power BI Desktop                                                                               201
Lab 14   Designing a Report in Power BI Desktop                                                                      239
Lab 15   Creating a Power BI Dashboard and Data Analysis                                                             285
Lab 16   Get Started with Tableau Desktop - Part 1                                                                   315
Lab 17   Working with Tableau - Part 2                                                                               332
         Final Term Exam

Lab 01
Set up the Working Environment

Objective:
The objective of this lab is to set up the development environment for creating a data warehouse.
Activity Outcomes:
The activities provide hands-on practice with the following topics:
• Install SQL Server 2019 Enterprise Edition
• Install SQL Server Management Studio (SSMS)
• Install and integrate SQL Server Data Tools (SSDT)
Instructor Note:
As a pre-lab activity, see the SQL Server installation guide on Microsoft Docs:
https://docs.microsoft.com/en-us/sql/database-engine/install-windows/install-sql-server?view=sql-server-ver15

1) Useful information
Microsoft SQL Server
Microsoft SQL Server is a relational database management system (RDBMS) that supports a wide variety
of transaction processing, business intelligence and analytics applications in corporate IT environments.
Microsoft SQL Server is one of the three market-leading database technologies, along with Oracle Database
and IBM's DB2.
SQL Server services, tools and editions
Microsoft also bundles a variety of data management, business intelligence (BI) and analytics tools with
SQL Server. In addition to the R Services and now Machine Learning Services technology that first
appeared in SQL Server 2016, the data analysis offerings include SQL Server Analysis Services, an
analytical engine that processes data for use in BI and data visualization applications, and SQL Server
Reporting Services, which supports the creation and delivery of BI reports.

On the data management side, Microsoft SQL Server includes SQL Server Integration Services, SQL Server
Data Quality Services and SQL Server Master Data Services. Also bundled with the DBMS are two sets of
tools for DBAs and developers: SQL Server Data Tools, for use in developing databases, and SQL Server
Management Studio, for use in deploying, monitoring and managing databases.

2) Solved Lab Activities

Sr. No      Allocated Time                                        Level of Complexity   CLO Mapping
Activity 1  30 minutes (may vary with system and internet speed)  Low                   CLO-5
Activity 2  1 hour (may vary with system and internet speed)      Low                   CLO-5
Activity 3  30 minutes (may vary with system and internet speed)  Low                   CLO-5

Activity 1:
This activity demonstrates the steps to be followed to install SQL Server, SSMS, and SSDT on the system.

Solution:

Downloading SQL Server Installer


Microsoft SQL Server download site
You will find the Developer edition download on the Microsoft SQL Server download site:

https://www.microsoft.com/en-us/sql-server/sql-server-downloads
Once the download is complete, go to the destination folder (i.e. downloads folder on your
computer). The installation file will look something like this:

Click on the install file to begin the install process.

Installing SQL Server Developer Edition


1. Once the installation starts, you will be presented with the installation type options. We will focus on the Custom install and explain various features of the installation.

Figure 1.1: Installation step one

2. Choose the Media Location path. Note the minimum free space and download size and press
Install.

Figure 1.2: Installation step two

3. Once the SQL Server Installation Center launches, choose the Installation tab (second from the right).
4. In most cases you will want to run a new SQL Server stand-alone installation, but other options are available; for example, if you have a previous version of SQL Server installed, you have the option to upgrade.

Figure 1.3: Installation step three

5. On the Product Key page, make sure that the selected Edition is “Developer” and click Next.

Figure 1.4: Installation step four

6. On the License Terms page, check the box next to “I accept the license terms” and click Next.
7. Setup will check for and, if needed, install the Setup Support Files. Click Next when complete.

Feature installation:

1. Select the components of SQL Server to install on your computer.

Select Database Engine Services; this is the minimum requirement to use SQL Server.

• For CS779, in addition to what is listed above, please review the feature descriptions to see which features you might be interested in for advanced topics for the term project.
• Instance Root Directory and Shared Features Directory: note the paths where SQL Server will install the components (the default is the Program Files folder on the C drive).

Instance Configuration

2. Generally, you can leave the Default Instance and the default Instance ID. Named instances would be used if you want to create multiple instances of SQL Server on the same machine. Click Next when complete.

Figure 1.5: Installation step five

Server Configuration

1. Review Server Configuration options.


a. Account Names: We suggest that you leave these set to defaults provided by the installer as outlined
below.
2. Startup types: If you would like to have SQL Server running at all times on your computer, the Startup
Type should be Automatic (which is the default) otherwise you can set it to Manual and start it when
you need to use SQL Server so that it does not take up system resources such as RAM. Leave the other
services to default (Manual & Disabled).

Figure 1.6: Server Configuration

• A few additional detailed explanations:


b. SQL Server Agent is used for running scheduled jobs, such as backups, scheduled SQL scripts
and db maintenance. If this was a production environment you would want this service set to
automatic.
c. You will need the SQL Server Database Engine to run SQL Server. Since the DBMS uses a lot of system resources, we would recommend starting it manually when you need it.
d. If you installed other components for SQL Server for advanced topics, you should also set them to
manual so that they don’t run on system startup.
e. SQL Server Browser can be left disabled.
f. You do not need to select the Grant Perform Volume Maintenance Task privilege to SQL Server for the course; in production environments it is often granted to speed up file initialization, though it has data confidentiality implications. Note the link on the page for additional details.
g. Check the collation tab at the top. For our purposes this can be left at default,
SQL_Latin1_General_CP1_CI_AS, which is Latin1-General case insensitive accent sensitive.
Collation defines the sorting rules, case and accent sensitivity for character data, for example you
can choose a different language or set it to be case sensitive. Some applications require for you to
choose a specific collation. You can click Customize to change it. Click Next when done.

Figure 1.7: Server Configuration
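The effect of a case-insensitive collation like SQL_Latin1_General_CP1_CI_AS can be illustrated outside SQL Server. The sketch below uses SQLite's built-in NOCASE collation from Python as a rough analogy (note that NOCASE only folds ASCII letters, so this is an illustration of the concept, not SQL Server behavior; the table and column names are made up for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Two columns: one compared with the default (case-sensitive) collation,
# one declared with the case-insensitive NOCASE collation.
cur.execute("CREATE TABLE t (cs TEXT, ci TEXT COLLATE NOCASE)")
cur.execute("INSERT INTO t VALUES ('Bike', 'Bike')")

# Case-sensitive column: 'bike' does not match 'Bike'.
cs_hits = cur.execute("SELECT COUNT(*) FROM t WHERE cs = 'bike'").fetchone()[0]
# Case-insensitive column: 'bike' matches 'Bike'.
ci_hits = cur.execute("SELECT COUNT(*) FROM t WHERE ci = 'bike'").fetchone()[0]

print(cs_hits, ci_hits)  # 0 1
```

This is why an application that expects one collation may break under another: the same WHERE clause returns different rows.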

Database Engine Configuration

Server Configuration:
• Authentication mode:
  • Windows authentication: uses only your Windows account privileges to connect to SQL Server.
  • Mixed mode: adds a local SQL system administrator (SA) account. IMPORTANT: We highly recommend using Mixed Mode so that there is an additional built-in SA account, with a separate user name and password, alongside your built-in Windows account in case you have issues logging in.
• IMPORTANT: Make sure to add users (such as your account) to SQL Server Administrators (click Add Current User) if they are not already there.
  • These accounts will allow you to log into SQL Server.
  • Note that the server itself does not need these accounts; it runs as the service account you specified in the previous step.

Figure 1.8: Database Engine Configuration

• A few additional detailed explanations:


a. You can leave Data Directories to defaults. Data Directories can be changed if you
have a multiple disk environment and for performance want to separate out where
different parts of the DBMS go. For example, in production environments the LOG
components should go on a separate disk array, which will improve performance of
the system.
• For additional tuning you can explore the TempDB, Max Degree of Parallelism and Memory settings. TempDB is a system database used by SQL Server. For additional details please review the following link: https://msdn.microsoft.com/en-us/library/ms190768.aspx As an example, this page allows you to customize auto-growth settings for the TempDB. For the courses, leaving the defaults is fine.
• If you are installing SQL Server for CS779 you might want to enable FILESTREAM if you plan to explore large file types such as binary large objects (BLOBs). As with many other features, this can be enabled at a later time.

Error Reporting, Installation Configuration Rules, & Ready to Install

• Review the selected features and click Install. Installation will begin; this will take some time.

Figure 1.9: Ready to Install

• Once the installation is complete, you will see “Congratulations, SQL Server install is complete”. Click Close.

Figure 1.10: Installation Progress bar

SQL Server Management Tools


You will need the SQL Server Management Tools to work with SQL Server. This is the user interface that includes components such as the query interface, as well as components for advanced topics such as Analysis and Integration Services and the Database Tuning Advisor. SQL Server, like other modern relational databases, uses a client-server architecture. The database itself is the server and contains all of the data and the capability to add, modify, delete, and access the data. A client is needed to connect to the database and perform specific commands. The most popular client by far for SQL Server is SQL Server Management Studio (SSMS), which you will install in this section. SSMS is very capable and provides many powerful conveniences and capabilities.

It is required that you install the Management Tools Complete for all courses.

Figure 1.11: SSMS Selection for Installation

Activity 2:
This activity demonstrates the steps to be followed to install SQL Server Management Studio (SSMS) on the system.

Solution:

Download SQL Server Management Studio (SSMS)

You will be brought to a web page to download the latest release of SQL Server Management Studio. Click
on the link to download the latest release and save the file to a location you can remember.

Install SQL Server Management Studio (SSMS)


Once downloaded, run the SSMS installer. The first screen that appears is shown below.

Figure 1.12: SSMS Installation step one

Click the “Install” button to begin. A progress screen will appear similar to the following.

Figure 1.13: SSMS Installation step two

Let it progress through until completion, then you will see a screen indicating successful setup, click close.
Congratulations! SSMS is now installed.

Working with and connecting to SQL Server


You have installed both SQL Server and SSMS. There are just a few more steps before you can start using your database to complete assignments: connecting to your database and creating a database for assignments.

Starting & Stopping SQL Server (optional)

IMPORTANT: If during setup you selected for SQL Server to start manually, then you will need to start the SQL Server services. Click Search at the bottom of the Windows screen and type Services in the search box.

Figure 1.13: SSMS Installation step three

• Scroll down the list until you see the SQL Server services.

Figure 1.14: SSMS Installation step four

• Start the following service: SQL Server (Instance Name)

Figure 1.15: SSMS Installation step five

• SQL Server service should now show that it is running

Figure 1.16: SSMS status

Notes:

• When you are no longer using SQL Server, you can shut the service down to save on system resources.
• You can also change the startup type to Automatic while the course is running, to save you the step of turning the service on and off.
• You may want to put a shortcut to Services on your desktop for quick access.

Starting SQL Server Management Studio

• To work with SQL Server, you will use SQL Server Management Studio. You will find it under the Microsoft SQL Server Tools program group, or type its name in the Windows Search bar.

• You may want to put the SQL Server Management Studio shortcut on your desktop or pin it to the Windows Task bar for quicker access.

Connecting to SQL Server

• In the Connect to Server dialog box:
  o Server Type: Database Engine (default)
  o Server Name: this is your system name (default).
  o Authentication: use
    ▪ Windows Authentication (default) and your account, OR
    ▪ SQL Server Authentication with the login SA and the password which you created during the install,
  and click Connect.

Figure 1.18: Connecting SQL server

You have just connected to your database through SQL Server Management Studio!

Activity 3:
This activity demonstrates the steps to be followed to install SQL Server Data Tools (SSDT) on the system.

Solution:

Installing SQL Server Data Tools (SSDT)


SQL Server Data Tools (SSDT) is a modern development tool for building SQL Server relational databases,
databases in Azure SQL, Analysis Services (AS) data models, Integration Services (IS) packages, and
Reporting Services (RS) reports. With SSDT, you can design and deploy any SQL Server content type with
the same ease as you would develop an application in Visual Studio.

Install SSDT with Visual Studio 2019


If Visual Studio 2019 is already installed, you can edit the list of workloads to include SSDT. If you don’t have Visual Studio 2019 installed, then you can download and install it first.
To modify the installed Visual Studio workloads to include SSDT, use the Visual Studio Installer.
1. Launch the Visual Studio Installer. In the Windows Start menu, you can search for "installer".

Figure 1.19: SSDT installation step one

2. In the installer, select the edition of Visual Studio that you want to add SSDT to, and then choose Modify.
3. Select SQL Server Data Tools under Data storage and processing in the list of workloads.

Figure 1.20: SSDT installation step two

3) Graded Lab Tasks
Note: The instructor can design graded lab activities according to the level of difficulty and complexity of the solved lab activities. The lab tasks assigned by the instructor should be evaluated in the same lab.

Lab Task 1
Students are required to install the required development environment before starting the lab activities.

Lab 02
Data Warehouse Schemas: Star and Snowflake schema

Objective:
The objective of this lab is to demonstrate various data warehouse schemas, including star and snowflake.
Activity Outcomes:
The activities provide hands-on practice with the following topics:
• Install the sample data.
• Query and test the sample data (AdventureWorks).
• Understand and design dimension-table models.
Instructor Note:
As a pre-lab activity, read Chapter 1 from the textbook “Data Mining and Data Warehousing: Principles and Practical Techniques”, Parteek Bhatia, Cambridge University Press, 2019.

1) Useful Concepts

Introduction to sample data (AdventureWorks database)


In order to demonstrate the product features of SQL Server, Microsoft provides sample databases built around fictitious business scenarios.
The AdventureWorks database, provided since SQL Server 2005, introduced the fictitious Adventure Works Cycles company. This company and its business scenarios, employees, and products are the basis of the following sample databases:

▪ AdventureWorks sample OLTP database


▪ AdventureWorksDW sample data warehouse
▪ AdventureWorksAS sample Analysis Services database

Adventure Works Cycles, the fictitious company on which the AdventureWorks sample database is based, is a large multinational production company. The company produces bicycles made of metal and composite materials. The products are exported to North America, Europe and Asia. The company is headquartered in Bothell, Washington, has 290 employees, and has multiple regional sales teams active around the world.
In 2000, Adventure Works Cycles purchased Importadores Neptuno, a small production plant in Mexico. Importadores Neptuno produces a variety of key sub-components for Adventure Works Cycles products. These sub-assemblies are shipped to the Bothell location for final product assembly. In 2001, Importadores Neptuno transformed into a manufacturer and seller focusing on touring mountain bike products.
After achieving a successful financial year, Adventure Works Cycles hopes to expand its market share by focusing on providing products to high-end customers, expanding its product sales channels through external websites, and cutting its sales costs by reducing production costs.

2) Solved Lab Activities

Sr. No      Allocated Time                              Level of Complexity   CLO Mapping
Activity 1  15 minutes (may vary with internet speed)   Low                   CLO-5
Activity 2  20 minutes                                  Low                   CLO-5

Activity 1:
Finding and installing the sample data
Load up some sample data. The sample data used is the AdventureWorks database, specifically the data warehousing version of the AdventureWorks database.
Go to https://docs.microsoft.com/en-us/sql/samples/adventureworks-install-configure?view=sql-server-ver15&tabs=ssms
Select the appropriate version; in our case, choose AdventureWorksDW (any version).

Figure 2.1: List of different versions of Sample data

To restore your database in SQL Server Management Studio, follow these steps:

1. Download the appropriate .bak file from one of the links provided in the Download backup files section shown in the picture above.
2. Move the .bak file to your SQL Server backup location. This varies depending on your installation location, instance name and version of SQL Server. For example, the default location for a default instance of SQL Server 2019 is:

C:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\MSSQL\Backup

3. Open SQL Server Management Studio (SSMS) and connect to your SQL Server.
4. Right-click Databases in Object Explorer > Restore Database... to launch the Restore
Database wizard.

Figure 2.2: Restoring the sample database, step one

5. Select Device and then select the ellipses (...) to choose a device.
6. Select Add and then choose the .bak file you recently moved to the backup location. If you moved your
file to this location but you're not able to see it in the wizard, this typically indicates a permissions issue
- SQL Server or the user signed into SQL Server does not have permission to this file in this folder.
7. Select OK to confirm your database backup selection and close the Select backup devices window.

8. Check the Files tab to confirm the Restore as location and file names match your intended location
and file names in the Restore Database wizard.
9. Select OK to restore your database.

Figure 2.3: Restoring the sample database, step two

Expand the Databases tree and AdventureWorksDW2019 will appear.

Activity 2:
Techniques for modeling dimension tables
There are two primary techniques: star and snowflake.
Star Schema
Look at the AdventureWorksDW database and expand Database Diagrams. Open the Finance diagram, and in the middle of the screen you will see one fact table, FactFinance, surrounded by five dimension tables.
This is a common design: multiple dimensions referenced by the same fact table. In particular, these dimension tables are not related to one another or to other dimension tables. They are very simple; all of the information about a dimension is contained in one table. This is called the star design.

Figure 2.3: Example of Star Schema

Snowflake Schema
The other diagram is Internet Sales. Here the FactInternetSales table is in the middle. Off to the right are connections to other dimension tables, and those dimension tables have relationships to one another. This allows a dimension to be broken down into more detail and gives more options for filtering, sorting, and searching.
However, this is a more complex design. It forces you to write bigger, more involved queries to get the same data out of the database. It can also be a performance hit: this design creates a lot more joins, and joining two large tables can be a very expensive process. Most of our dimension tables probably shouldn't be too large, but you do get into some scenarios with large dimension tables. If we look at the Reseller Sales diagram, we again see relationships between the different dimension tables, similar to what we saw in Internet Sales.

Figure 2.4: Example of Snowflake Schema

The way this schema is laid out, it could be argued that it looks like the branches of a snowflake. The dimension tables branch off in various directions, but then those branches come back and connect to one another, which looks a little like a snowflake. Therefore, this is called the snowflake technique.
So, we have two primary techniques for structuring our dimension tables. The star technique is very simple: each dimension is stored in its entirety in one table. Contrast that with the snowflake technique.
In the snowflake technique, a dimension is split up among multiple tables. Each approach has advantages and disadvantages. The star is the simpler way to go, will typically give better performance, and is easier to write queries against. The snowflake is a more complex design, will be more difficult to write queries against, and may give slower performance, but it allows more robust dimensions.
Realistically, most data warehouses use some of both. It is very rare to see a data warehouse that is 100% star or 100% snowflake. Typically, some of your dimensions can easily be captured in one table, where you can use the star method, and other dimensions logically require multiple tables, where you can use the snowflake technique. You can mix these two techniques, and that is very common.
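The difference between the two techniques can be sketched with a tiny example. The code below uses SQLite from Python with made-up table and column names (not the AdventureWorks schema): the star query needs one join per dimension, while reaching a snowflaked attribute costs an extra join for the same answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Star: the whole product dimension lives in one table.
cur.execute("CREATE TABLE dim_product_star (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT)")
# Snowflake: the same dimension split into product and category tables.
cur.execute("CREATE TABLE dim_category (category_key INTEGER PRIMARY KEY, category TEXT)")
cur.execute("CREATE TABLE dim_product_snow (product_key INTEGER PRIMARY KEY, name TEXT, category_key INTEGER)")
cur.execute("CREATE TABLE fact_sales (product_key INTEGER, amount REAL)")

cur.execute("INSERT INTO dim_product_star VALUES (1, 'Road-150', 'Bikes')")
cur.execute("INSERT INTO dim_category VALUES (10, 'Bikes')")
cur.execute("INSERT INTO dim_product_snow VALUES (1, 'Road-150', 10)")
cur.executemany("INSERT INTO fact_sales VALUES (?, ?)", [(1, 100.0), (1, 50.0)])

# Star: a single join from the fact table to the dimension.
star = cur.execute("""
    SELECT d.category, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product_star d ON f.product_key = d.product_key
    GROUP BY d.category
""").fetchall()

# Snowflake: an extra join is needed to reach the category.
snow = cur.execute("""
    SELECT c.category, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product_snow p ON f.product_key = p.product_key
    JOIN dim_category c ON p.category_key = c.category_key
    GROUP BY c.category
""").fetchall()

print(star, snow)  # both [('Bikes', 150.0)]
```

Both queries return the same total; the snowflake version simply pays for one more join, which is the performance trade-off discussed above.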

3) Graded Lab Tasks (1 hour)

Note: The instructor can design graded lab activities according to the level of difficulty and complexity of the solved lab activities. The lab tasks assigned by the instructor should be evaluated in the same lab.

Lab Task 1
Students are required to explore the sample data (AdventureWorks Data Warehouse) which include tables,
schemas and data available in sample data warehouse.

Lab Task 2
Design following SQL Queries, Run them and show output.
a. For every customer with a 'Main Office' in Dallas show AddressLine1 of the 'Main Office' and
AddressLine1 of the 'Shipping' address - if there is no shipping address leave it blank. Use one row per
customer.
b. For each order show the SalesOrderID and SubTotal calculated three ways:
A) From the SalesOrderHeader
B) Sum of OrderQty*UnitPrice
C) Sum of OrderQty*ListPrice
c. Show the best selling item by value.
d. Show how many orders are in the following ranges (in $):

RANGE       Num Orders   Total Value
0-99
100-999
1000-9999
10000-

e. Identify the three most important cities. Show the break down of top level product category against city.

Note: students are required to submit the queries along with their results.

Lab 03
Creating views and indexes on Data warehouse

Objective:
The objective of this lab is to create views to improve the implementation of a data warehouse. This lab will help you create different views of the data warehouse.
Activity Outcomes:
The activities provide hands-on practice with the following topics:
• Create views
• Create indexes on views
Instructor Note:
As a pre-lab activity, read Chapter 1 from the textbook “Data Mining and Data Warehousing: Principles and Practical Techniques”, Parteek Bhatia, Cambridge University Press, 2019.

1) Useful Concepts

Views:
A view is defined by a query that combines data from different tables; hence, a view does not hold data of its own. A materialized view, on the other hand, commonly used in data warehousing, does hold data. This data helps in decision making, performing calculations, etc.; it is computed ahead of time using queries.
When an ordinary view is created, no data is stored in the database; the data is produced when a query is run against the view. The data of a materialized view, by contrast, is stored.
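The distinction can be sketched in SQLite from Python. SQLite has plain views but no materialized views, so the "materialized" copy below is simulated with an ordinary table populated from the same query (an analogy for the concept, not SQL Server or Oracle syntax; the table names are invented for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE sales (amount REAL)")
cur.execute("INSERT INTO sales VALUES (100.0)")

# A plain view stores no data; its query runs each time it is selected.
cur.execute("CREATE VIEW v_total AS SELECT SUM(amount) AS total FROM sales")
# A 'materialized view' simulated as a table holding precomputed results.
cur.execute("CREATE TABLE mv_total AS SELECT SUM(amount) AS total FROM sales")

cur.execute("INSERT INTO sales VALUES (50.0)")  # new data arrives

view_total = cur.execute("SELECT total FROM v_total").fetchone()[0]  # recomputed each query
mv_total = cur.execute("SELECT total FROM mv_total").fetchone()[0]   # stale until refreshed
print(view_total, mv_total)  # 150.0 100.0
```

The plain view reflects the new row immediately because its query re-runs; the materialized copy stays at the old total until it is refreshed, which is exactly the trade-off between freshness and precomputation.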
2) Solved Lab Activities (Allocated Time: 1 hour)

Sr. No      Allocated Time   Level of Complexity   CLO Mapping
Activity 1  30 minutes       Medium                CLO-5
Activity 2  30 minutes       Medium                CLO-5

Activity 1:

Creating views to improve the implementation of a data warehouse

Now let’s see how SQL Server views can improve the implementation of a data warehouse. In the first situation we select DimGeography, which has a snowflake relationship to another dimension called DimSalesTerritory. Both of these dimensions are interesting on their own, but they are also interesting when the data from the two tables is combined. It is easier to create a view that pulls in data from each table, giving us a single entity through which to look at all of the data. First let’s look at the geography dimension: right-click the table and select the top rows.

Figure 3.1: Data retrieval from geography dimension table

You may have noticed that the geography dimension contains information about countries, cities, and states. Scroll all the way to the right and you will see it is linked by key to the sales territory.
Right-click the DimSalesTerritory table and look at the top rows; observe that it contains the Northwest United States, Northeast United States, Central United States, and so on. These are different sales territories, but this doesn’t tell us what cities are in what territory.
In order to know which cities are in which territory, we need to look at both the sales territory dimension and the geography dimension. We can create a view to make that easier by using the following code:

CREATE VIEW [dbo].[Total_DimTerritoryAndGeography]
AS
SELECT dbo.DimGeography.City,
       dbo.DimGeography.StateProvinceCode,
       dbo.DimGeography.StateProvinceName,
       dbo.DimGeography.CountryRegionCode,
       dbo.DimGeography.EnglishCountryRegionName,
       dbo.DimSalesTerritory.SalesTerritoryGroup,
       dbo.DimSalesTerritory.SalesTerritoryRegion
FROM dbo.DimGeography
     INNER JOIN dbo.DimSalesTerritory
     ON dbo.DimGeography.SalesTerritoryKey = dbo.DimSalesTerritory.SalesTerritoryKey

This will create a view that references the most interesting fields from both tables: the city, state, and country from the geography table, and also the sales territory group and sales territory region from the territory table. The connection between the two is very simple: it is the SalesTerritoryKey field, which exists in both tables.
Run this; it should report that the command completed successfully. Now check for the view under the list of views; if it does not appear, refresh the Views folder. Right-click the newly created Total_DimTerritoryAndGeography view and select the top rows.

Figure 3.2: Result of query on view created

Now you can see the city, state, territory group and territory region all on one line. This should make it easier and more convenient for developers to look at both of these dimensions at the same time.
The other scenario where views can improve the implementation of our data warehouse involves aggregating data: information like sales, profit, revenue, and expenses. Usually we don’t want to look at those things line by line; we want to look at totals.
Maybe it’s a total for a week, a month, or a year, but there is probably going to be some sort of grouping. So, creating a view with a GROUP BY clause can help us get started on that aggregation.
Again, below is some code that is going to create such a view.

CREATE VIEW [dbo].[Total_FactInternetSales]


AS
SELECT SUM(DiscountAmount) AS Total_DiscountAmount,
SUM(ProductStandardCost) AS Total_ProductStandardCost,
SUM(TotalProductCost) AS Total_TotalProductCost,
SUM(SalesAmount) AS Total_SalesAmount,
OrderDate,
CustomerKey,
CurrencyKey
FROM dbo.FactInternetSales
GROUP BY OrderDate, CustomerKey, CurrencyKey

It is going to sum four different fields, all of them dealing with currency: discount amount, product standard cost, total product cost, and sales amount. It groups by three different fields: the order date, the customer key, and the currency key. That will allow us to run reports on a certain time frame, and/or on a customer or group of customers, and/or in certain currencies.

Let's create this view by executing the code. Select the top rows from it, and we can see the data that was returned. This is a convenience for developers: they no longer have to manually set up the GROUP BY fields; they can just pull from this view.

Figure 3.3: Result of query on view created


Activity 2:

Adding an Index to a View:

This implementation, however, does not by itself provide a performance increase. To get one, we need to attach an index to the view. As the view stands now, with no indexes, it does not create a new copy of the data: every time we select from the view, SQL Server goes to the tables, pulls the data, performs all the necessary math, and then displays the data to the user. If we add an index to the view, that forces the engine to create a secondary copy of the data that is already aggregated. Now when someone queries this view, it doesn't go back to the tables and do all the math again; it reads directly from the view with the pre-aggregated totals. That can be a significant performance increase for many of our queries, but be aware it can be a performance decrease when we load new data. Every time we load new data into this data warehouse, the view has to be updated. The engine handles that automatically, but it will increase the time it takes to load data.
ALTER VIEW [dbo].[Total_FactInternetSales]
WITH SCHEMABINDING
AS
SELECT SUM(DiscountAmount) AS Total_DiscountAmount,
SUM(ProductStandardCost) AS Total_ProductStandardCost,
SUM(TotalProductCost) AS Total_TotalProductCost,
SUM(SalesAmount) AS Total_SalesAmount,
OrderDate,
CustomerKey,
CurrencyKey,
COUNT_BIG(*) as RecordCount

FROM dbo.FactInternetSales
GROUP BY OrderDate, CustomerKey, CurrencyKey

Two changes are added to the view:
SCHEMABINDING, which creates a binding between the view and the underlying tables, preventing a change to one without a corresponding change to the other.
COUNT_BIG(*), which counts the number of records; this is a requirement in order to add an index onto any view that has a GROUP BY.
Execute this and confirm the command completed successfully. Now add an index to this view using the following query:
CREATE UNIQUE CLUSTERED INDEX [IX_Total_FactInternetSales]
ON [dbo].[Total_FactInternetSales] (OrderDate, CustomerKey, CurrencyKey)

This creates a unique clustered index, because the first index on a view has to be a unique clustered index; after that you can create non-clustered indexes. Here the index covers order date, customer key, and currency key.
Execute it, then expand the view in Object Explorer and look at its Indexes folder; we do in fact see one index listed under the view.

Figure 3.4: Viewing index created on Total_FactInternetSales

The index is now added to the view. In the background, the engine has made a secondary copy of the data. We still have the original data in the original tables, and we also have a pre-aggregated copy of that data in the view. So, when we query this view, the engine doesn't have to do all of the GROUP BY math; it just reads the data it has already aggregated. That can be a significant performance increase for us.
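As a quick check, the indexed view can now be queried directly. A minimal sketch, assuming the view created above; note that on SQL Server editions other than Enterprise or Developer, the WITH (NOEXPAND) hint is needed before the optimizer will read the view's index rather than expanding back to the base table:

```sql
-- Read the pre-aggregated totals straight from the indexed view.
-- WITH (NOEXPAND) forces use of the view's clustered index on
-- editions that do not match indexed views automatically.
SELECT OrderDate,
       CustomerKey,
       Total_SalesAmount
FROM dbo.Total_FactInternetSales WITH (NOEXPAND)
ORDER BY OrderDate;
```

You can compare the execution plan of this query with a direct GROUP BY over FactInternetSales to see the difference the index makes.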
3) Graded Lab Tasks (Allocated Time 1 Hr.)
Note: The instructor can design graded lab activities according to the level of difficulty and complexity of
the solved lab activities. The lab tasks assigned by the instructor should be evaluated in the same lab.

Lab Task 1
Create a view on sales amounts greater than 25000, along with the name, hire date, and title of the employee handling the sale, and the day of the month, using FactSalesQuota, DimEmployee, and DimDate. Also create an index on the view.
You need to submit a report with screenshots of the results and of the queries for the views and indexes.

Lab 04
Conversion of Entity Relationship Diagram (ERD) to
Dimensional Model (DM): Working with Sample
database Sakila

Objective:
The objective of this lab is to help students learn about creating a Dimensional Model (DM) from an Entity
Relationship Diagram (ERD) or an OLTP database (using a sample database).
Activity Outcomes:
The activities provide hands-on practice with the following topics:
• Identify business processes from OLTP systems.
• Create a DM from an ERD.
Instructor Note:
As a pre-lab activity, read chapter 1 from the text book “The Data Warehouse Toolkit: The Definitive Guide
to Dimensional Modeling, Ralph Kimball & Margy Ross, Wiley, 2013”.

1) Useful Concepts

Introduction to sample database (The Sakila example database)


The Sakila database is a nicely normalised database modelling a DVD rental store. (The Sakila example
database was originally developed by Mike Hillyer of the MySQL AB documentation team; it was ported
to other databases by DB Software Laboratory.) Its design includes a few nice features:

• Many to many relationships


• Multiple paths between entities (e.g. film-inventory-rental-payment vs film-inventory-store-customer-payment) to practice joins
• Consistent naming of columns
o Primary keys are called [tablename]_[id]
o Foreign keys are called like their referenced primary key, if possible. This allows for
using JOIN .. USING syntax where supported
o Relationship tables do not have any surrogate keys but use composite primary keys
• Every table has a last_update audit column
• A generated data set of a reasonable size is available
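The consistent key naming pays off directly in query syntax. As a small illustration (valid on MySQL or PostgreSQL, which support JOIN .. USING; SQL Server does not):

```sql
-- Because film_actor's foreign keys share the exact names of the
-- referenced primary keys (film_id, actor_id), the joins can be
-- written with USING instead of explicit ON clauses.
SELECT f.title, a.first_name, a.last_name
FROM film AS f
JOIN film_actor AS fa USING (film_id)
JOIN actor AS a USING (actor_id)
LIMIT 5;
```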

ERD

Figure 4.1: ERD of DVD Rental database

• actor — contains actors data including first name and last name.
• film — contains films data such as title, release year, length, rating, etc.
• film_actor — contains the relationships between films and actors.
• category — contains film’s categories data.
• film_category — contains the relationships between films and categories.
• store — contains the store data including manager staff and address.
• inventory — stores inventory data.
• rental — stores rental data.
• payment — stores customer’s payments.
• staff — stores staff data.
• customer — stores customer’s data.
• address — stores address data for staff and customers
• city — stores the city names.
• country — stores the country names.

Sample Queries
1. Actor with most films (ignoring ties)
SELECT first_name, last_name, count(*) films
FROM actor AS a
JOIN film_actor AS fa USING (actor_id)
GROUP BY actor_id, first_name, last_name
ORDER BY films DESC
LIMIT 1;

Result:
first_name last_name films
--------------------------------
GINA DEGENERES 42

2. Cumulative revenue of all stores


SELECT payment_date, amount, sum(amount) OVER (ORDER BY payment_date)
FROM (
SELECT CAST(payment_date AS DATE) AS payment_date, SUM(amount) AS amount
FROM payment
GROUP BY CAST(payment_date AS DATE)
) AS p
ORDER BY payment_date;

Result:
payment_date amount sum
-------------------------------------
2005-05-24 29.92 29.92
2005-05-25 573.63 603.55
2005-05-26 754.26 1357.81
2005-05-27 685.33 2043.14
2005-05-28 804.04 2847.18
2005-05-29 648.46 3495.64
2005-05-30 628.42 4124.06
2005-05-31 700.37 4824.43
2005-06-14 57.84 4882.27
2005-06-15 1376.52 6258.79
2005-06-16 1349.76 7608.55
2005-06-17 1332.75 8941.30

Analysis Queries:
1. What is the number of rentals per month for each store?
SELECT s.store_id,
EXTRACT(ISOYEAR FROM r.rental_date) AS rental_year,
EXTRACT(MONTH FROM r.rental_date) AS rental_month,
COUNT(r.rental_id) AS count_rentals
FROM rental r

JOIN staff
USING (staff_id)
JOIN store s
USING (store_id)
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3;

Figure 4.2: Number of rentals at each store

The plot shows a comparison between the two stores for each month. As we can see, there is no
significant difference between the two stores.

2) Solved Lab Activites

Sr. No Allocated Time Level of Complexity CLO Mapping

Activity 1 30 minutes Medium CLO-5

Activity 2 30 minutes Medium CLO-5

Activity 1: Designing business questions

The first step in designing a data warehouse for the Sakila DVD rental database is to determine what
questions the business would like to answer. For example, the business may wish to know answers to the
following questions.

• Which store has the most rentals?
• Which district has the most rentals?
• Which week of the month has the highest volume of rentals?
• Which week of the year has the highest volume of rentals?
• Is the rental business growing, month over month? and year over year?
• What time of the day is the most active for DVD returns?
• Do rentals decrease when the film duration exceeds a certain time?
• What staff member has the most rentals? the least rentals?
• What movie category type (e.g. genre) results in the most rentals? Note that this is not addressed in
the sample data warehouse schema that was designed for this project.

Activity 2: Designing Dimensional Model

Once the business needs are identified, design of the data warehouse can proceed. The next step is to
define what will be the facts (i.e. measures) and what will be the dimensions (i.e. aspects of the business
process) to be tracked.
The facts for a business are typically sales, cost, inventory, and units sold. Dimensions typically answer
the questions what, where, who, and when. For a DVD rental business, the dimensions would typically be
product (i.e. film, the what), store (the where), customer (the who), salesperson (again, the who), and date
and time (the when).
For our data warehouse, the following is defined for the fact and dimensions:

• Facts: units rented and units returned


• Dimensions: product (i.e. film), store, customer, staff, date and time

The data warehouse schema that will be utilized to answer the Business questions is shown in the figure
below:
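As a sketch of how such a star schema could be declared in SQL, with the fact table carrying the two measures and one surrogate key per dimension. The table and column names here are illustrative assumptions, not necessarily those used in the figure:

```sql
-- Illustrative star-schema fact table for the DVD rental warehouse.
-- One row per rental event; each *_key column points at a dimension.
CREATE TABLE fact_rental (
    rental_key     INT PRIMARY KEY,
    date_key       INT NOT NULL REFERENCES dim_date (date_key),
    time_key       INT NOT NULL REFERENCES dim_time (time_key),
    film_key       INT NOT NULL REFERENCES dim_film (film_key),
    store_key      INT NOT NULL REFERENCES dim_store (store_key),
    customer_key   INT NOT NULL REFERENCES dim_customer (customer_key),
    staff_key      INT NOT NULL REFERENCES dim_staff (staff_key),
    units_rented   INT NOT NULL,   -- measure
    units_returned INT NOT NULL    -- measure
);
```

Queries against the business questions above then become a join from this fact table to the relevant dimension tables, grouped by the dimension attributes of interest.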

Figure 4.3 : Dimension Model (Star Schema) of DVD Rental data warehouse

3) Graded Lab Tasks


Note: The instructor can design graded lab activities according to the level of difficulty and complexity of
the solved lab activities. The lab tasks assigned by the instructor should be evaluated in the same lab.

Lab Task 1
You are required to design a data warehouse for an existing database. Consider the following sample
database provided by MySQL. You can download this sample database (tables and data) from the link:
Download MySQL Sample Database.

MySQL Sample Database Schema

The MySQL sample database schema consists of the following tables:

• Customers: stores customer’s data.


• Products: stores a list of scale model cars.
• ProductLines: stores a list of product line categories.
• Orders: stores sales orders placed by customers.

• OrderDetails: stores sales order line items for each sales order.
• Payments: stores payments made by customers based on their accounts.
• Employees: stores all employee information as well as the organization structure such as who
reports to whom.
• Offices: stores sales office data.

Submit the complete report after performing all following tasks.


1. You are required to formulate the business rules (questions) according to the sample data set (like
we did in the solved lab activities).
2. Add a description of the current database and then convert it to a data warehouse schema based on
the business questions you designed in task 1.
3. Mention the details of how you selected the fact and dimension tables.
4. Show the data warehouse schema in the form of a figure.

Note: Try to pick a different sample database. If your sample database is the same as someone else's,
then your business rules should differ from those of your classmates.

Lab 05
Demonstration of ETL Tool: SQL Server Integration
services-SSIS (Extraction and Loading)

Objective:
The objective of this lab is to help students work with SSIS for successful data transformation and loading
into a data warehouse from various types of data sources. This lab will help to load data from a CSV file
into a database/data warehouse table.
Activity Outcomes:
The activities provide hands-on practice with the following topics:
• Install SSIS
• Load data from CSV file to database table.
Instructor Note:
As a pre-lab activity, read chapter 3 from the text book “The Data Warehouse ETL Toolkit: Practical
Techniques for Extracting, Cleaning, Conforming, and Delivering Data, Ralph Kimball & Joe Caserta,
Wiley, 2004”.

1) Useful Concepts
What is SSIS:
SQL Server Integration Services (SSIS) is a component of the Microsoft SQL Server database software that
can be used to conduct a wide range of data integration tasks. SSIS is a fast and flexible data warehousing
tool used for data extraction, loading, and transformation tasks such as cleaning, aggregating, and merging data.
It makes it easy to move data from one database to another database. SSIS can extract data from a wide
variety of sources like SQL Server databases, Excel files, Oracle and DB2 databases, etc.
SSIS also includes graphical tools & wizards for performing workflow functions like sending email
messages, FTP operations, data sources, and destinations.

Why we use SSIS:

Here, are key reasons for using SSIS tool:

• SSIS tool helps you to merge data from various data stores
• Automates Administrative Functions and Data Loading
• Populates Data Marts & Data Warehouses
• Helps you to clean and standardize data
• Building BI into a Data Transformation Process
• SSIS contains a GUI that helps users to transform data easily rather than writing large programs
• It can load millions of rows from one data source to another in very few minutes
• Identifying, capturing, and processing data changes

• Coordinating data maintenance, processing, or analysis
• SSIS eliminates the need of hardcore programmers
• SSIS offers robust error and event handling

Components of SSIS Architecture:

• Control Flow (Stores containers and Tasks)


• Data Flow (Source, Destination, Transformations)
• Event Handler (sending of messages, Emails)
• Package Explorer (Offers a single view for all in package)
• Parameters (User Interaction)

2) Solved Lab Activites


Sr. No Allocated Time Level of Complexity CLO Mapping

Activity 1 30 minutes (may vary due to system and internet speed) Low CLO-5
Activity 2 30 minutes Medium CLO-5

Activity 1:

Adding the SSIS Projects extension to the Visual Studio 2019

When Visual Studio is opened, we click on "Continue without code" to add the necessary
extension:

Figure 5.1: SSIS Installation Step one

In this window, we click on "Extensions" > "Manage Extensions":

Figure 5.2: SSIS Installation Step two

In the search bar of the opened window, we type "Integration Services" to easily locate the extension.
From the appeared list we choose "SQL Server Integration Services Projects" and press "Download":

Figure 5.3: SSIS Installation Step three

Then, we will execute the downloaded .exe file:

Figure 5.4: SSIS Installation Step four

The installation of the extension begins. Now, we will follow some simple steps. In the next window we
click "OK":

Figure 5.5: SSIS Installation Step five

After that, we click "Next" to continue:

Figure 5.6: SSIS Installation Step six

If you receive the following message, you probably have SQL Server Management Studio opened:

Figure 5.7: SSIS Installation (Frequently occurring error)

Close it and click "OK". The process should continue:

Figure 5.8: SSIS Installation progress bar

Finally, the setup is completed and we have our extension installed:

Figure 5.9: SSIS Installation completed

Now, we are ready to create Integration Services projects. In Visual Studio, we choose "Create a new
project":

Figure 5.10: Integration service project creation step one

In the next window, we type "integration" to find "Integration Services Project" and click on it:

Figure 5.11: Integration service project creation step two

We choose a name for our project:

Figure 5.12: Integration service project creation ( adding project name)

It is ready! We have opened the interface where we can design and develop SSIS 2019 packages:
Figure 5.13: Integration service project window

Activity 2:

Import CSV File into Database:

First, you need to prepare the environment by creating the SQL Server table and the CSV file.

Run the script below in SQL Server to create the SQL table either on a new database or an existing one.
For this example, I used my ‘TrainingDB’ database.

/* Creates table for Students.csv */


CREATE TABLE StudentDetails
(
Surname varchar(50),
Firstname varchar(50),
DateofBirth datetime,
PostCode varchar(50),
PhoneNumber varchar(50),
EmailAddress varchar(50)
)

Now create a CSV file with the data below. Open a Notepad file, add the headings separated with commas (,),
and add each record as a new line (with a comma between each value in a row). Save the Notepad file with a .csv extension.

Surname Firstname DOB Postcode PhoneNo EmailAddress


Bonga Fred 24-02-1990 SA1 5XR 08100900647 [email protected]
Smith Gill 08-05-1992 RMT 12TY 08200900793 [email protected]
Taylor Jane 01-12-1979 PM2E 3NG 09600900061 [email protected]
Brown John 06-10-1986 CQ7 1JK 08200900063 [email protected]
Cox Sam 18-03-1982 STR3 9KL 08100900349 [email protected]
Lewis Mark 30-09-1975 DN28 2UR 08000900200 [email protected]
Kaur Ahmed 26-07-1984 NI12 8EJ 09500900090 [email protected]
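Saved as Students.csv, the file contents would look like the following (the header names mirror the table above; the bracketed email placeholders are kept exactly as given):

```
Surname,Firstname,DOB,Postcode,PhoneNo,EmailAddress
Bonga,Fred,24-02-1990,SA1 5XR,08100900647,[email protected]
Smith,Gill,08-05-1992,RMT 12TY,08200900793,[email protected]
Taylor,Jane,01-12-1979,PM2E 3NG,09600900061,[email protected]
Brown,John,06-10-1986,CQ7 1JK,08200900063,[email protected]
Cox,Sam,18-03-1982,STR3 9KL,08100900349,[email protected]
Lewis,Mark,30-09-1975,DN28 2UR,08000900200,[email protected]
Kaur,Ahmed,26-07-1984,NI12 8EJ,09500900090,[email protected]
```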

After launching Microsoft Visual Studio, navigate to File - New - Project, as shown below.

Figure 5.14: Create new project

Under the Business Intelligence group, select Integration Services and then Integration Services Project. Enter
a name for the project and a name for the solution, for example “Load CSV”. You can check the “Create a
directory for solution” box if you want a separate solution directory.

Figure 5.15: Select integration services project

Click OK

On the right side of the displayed screen, in the “Solution Explorer” window, change the name of the default
package to “Load CSV File into Table”

Figure 5.16: Rename SSIS Package

On the left side of the screen, in the SSIS Toolbar, drag the “Data Flow” to the “Control Flow” window
and rename the task to “Load CSV File”

Next, you need to set up the connection managers for both the CSV file and the SQL Server table, also
known as the source and destination respectively. At the bottom of the screen, under Connection
Managers, right-click and select “New Flat File Connection”, then configure the Flat File connection
manager as shown below.

Figure 5.17: Flat file connection manager

Enter a suitable Connection manager name and specify the filepath for the Students.csv file. Click OK.

For the table’s connection manager, right-click again in the Connection Managers window
and click on “New OLE DB Connection”. Click on New and specify the server name and the database
that contains the StudentDetails table.

Figure 5.18: Establish OLEDB connection

You can test the connection by clicking “Test Connection”, then click OK and OK again. You should now
have the two connection managers at the bottom of the screen.
Drag the “Flat File Source” from the SSIS Toolbox into the “Data Flow” window and rename it “CSV
File”.

Figure 5.19: Loading flat file

Double click on this source and select the “Student CSV File” connection manager. Click on Columns on
the left side of the screen to review the columns in the file. Click OK.

Then drag the “OLE DB Destination” from the SSIS Toolbox to the “Data Flow” window and rename it as
“SQL Table”. Drag the blue arrow from the source to the destination.

Double click on the destination and configure as shown below.

Figure 5.20: Connecting database table

Click on Mappings on the left side of the screen and ensure all fields are mapped correctly from source to
destination.

Figure 5.21: Mapping columns of source flat file to destination database table column

Click OK. Your screen should look like the image below.

Figure 5.22: Successful loading of data from flat file to database table

Run the package by clicking on Start. When the package finishes executing, you can check the table to view
the data from the CSV file.

Other examples to do the same task


https://ozanecare.com/import-csv-file-into-sql-server-table-using-ssis/

3) Graded Lab Tasks


Note: The instructor can design graded lab activities according to the level of difficult and complexity of
the solved lab activities. The lab tasks assigned by the instructor should be evaluated in the same lab.

You are required to design the Student Exam database (the ERD is given below) and load data from
different CSV and HTML files into each of its tables.

Figure 5.23: ERD of student Exam database

Lab 06
Demonstration of ETL tool : SQL Server Integration
services-SSIS (Transformation)

Objective:
The objective of this lab is to help students work with SSIS for successful data transformation and loading
into a data warehouse from various types of data sources. This lab will help in understanding the different
types of transformations that can be applied to data before loading it into the data warehouse.
Activity Outcomes:
The activities provide hands-on practice with the following topics:
• Work with different data transformations
• Create an SSIS package for data sampling.
Instructor Note:
As a pre-lab activity, read chapter 3 from the text book “The Data Warehouse ETL Toolkit: Practical
Techniques for Extracting, Cleaning, Conforming, and Delivering Data, Ralph Kimball & Joe Caserta,
Wiley, 2004”.

1) Useful Concepts
SSIS Transformation:

The SSIS transformations are the data flow components that are used to perform aggregations, sorting,
merging, modifying, joining, data cleansing, and distributing the data.
Apart from these, there is an important and powerful transformation in SSIS called the Lookup transformation,
used to perform lookup operations. In this lab, we list the available SSIS transformations
and explain their working functionality.
Business Intelligence Transformations in SSIS:
The following list of SSIS transformations will perform Business Intelligence operations such as Data
Mining, Correcting, and cleaning the data.

Row Transformation in SSIS:

The below list of SSIS transformations is useful to update the existing column values and to create new
columns.

Rowset Transformations:

The following transformations create new rowsets. The rowset can include aggregate and sorted values,
sample rowsets, or pivoted and unpivoted rowsets.

Split and Join Transformations:
The following transformations distribute rows to different outputs, create copies of the transformation
inputs, join multiple inputs into one output, and perform lookup operations.

Auditing Transformations:
Integration Services includes the following transformations to add audit information and count rows.

2) Solved Lab Activites

Sr. No Allocated Time Level of Complexity CLO Mapping

Activity 1 30 minutes Medium CLO-5

Activity 2 30 minutes Medium CLO-5

Activity 1: Creating SSIS package for data sampling

Let’s create an SSIS package for data sampling in the SSIS package.

The following table in the AdventureWorks database contains 19820 rows.

SELECT [CustomerID]
,[PersonID]
,[StoreID]
,[TerritoryID]
,[AccountNumber]
,[rowguid]
,[ModifiedDate]
FROM [adventureworks2014].[Sales].[Customer]

In the SSIS Control flow window, add a data flow task and rename it to Data Sampling Transformation
in SSIS.

Figure 6.1: Data sampling transformation step one

Right-click on this data flow task and Edit. It takes you to data flow page. In this page, you can see that we
are in this particular data flow task.

Figure 6.2: Data sampling transformation step two

Add an OLE DB source and rename the task as appropriate. This task should point to your SQL instance and
the Sales.Customer table in the AdventureWorks database.

Figure 6.3: Data sampling transformation step three

Right-click on a blank area in the data flow task and click on Add Annotation. An annotation is similar to a
text box: it does not execute, and we use it to display notes that help in understanding the SSIS package.

I specified 19820 rows in this annotation box.

Figure 6.4: Data sampling transformation step four

We have prepared the base of the SSIS package in this step. Let’s move forward with data sampling
transformations in SSIS.

Row Sampling Transformation in SSIS:

We use the Row Sampling transformation to retrieve a specified number of random data rows from the source
table. It gives different random data every time we execute the SSIS package. You get two outputs from this
transformation.

1. Random data based on a specified number of rows


2. The rest of the data, i.e. rows not included in output 1

Add a Row Sampling transformation from the SSIS toolbox and drag the blue arrow from source to
transformation, as shown below.

Figure 6.5: Row sampling transformation step one

Double click on Row Sampling, and it opens the row sampling transformation editor.

Figure 6.6: Row sampling transformation Editor

We have the following options in this editor window.

• The number of rows: We can specify the number of random rows we want from the
transformation. The default value is 10. Let’s modify it to 1000 rows
• Sample Output name: It shows the name of the output that we get from the transformation as
specified by the number of rows parameter. The default name is Sampling Selected Output. Let’s
modify it to Row Sampling Match output
• Unselected output name: the name of the output that contains the data excluded from the row
sampling. This output receives the total number of rows in the table minus the number of rows
specified. Let’s modify the name to Excluded data

Figure 6.7: Row sampling transformation editor (changing name of Unselected output)

Let’s skip the option ‘Use the following random seed’ for now. We will cover it in the latter part of the
lab.

Click on Columns, and it shows all available source columns.

Figure 6.8: Row sampling transformation Editor ( viewing source column)

Now, add two SSIS Multicast transformations and rename them as follows.

• Multicast – Matched
• Multicast – Unmatched

Figure 6.9: SSIS multicast transformation

Join the output from Row Sampling to Multicast – Matched, and it opens the input-output selection window.
In the output column, select the output – Row Sampling Match output.

Figure 6.10: Joining Input/Output to multicast

Similarly, take the second output from Row Sampling and join to Multicast – unmatched transformation. It
will automatically take another available output, as shown below.

We added SSIS Multicast operator here to display the data. If you want to insert data into SQL table,
configure OLE DB destination as well.

Figure 6.11: Splitting matched and unmatched data

Right-click on the arrow between Row Sampling and Multicast- Matched and enable data viewer.

Figure 6.12: Enabling data viewer

It shows the following symbol on the arrow.

Figure 6.12: Data viewer symbol shown on the Row Sampling match output

Press F5 to execute the SSIS package. It opens the data viewer, and you can see the 1000 rows in the output.

Figure 6.13: Viewing data using data viewer

Close the data viewer and let the package execution complete. In the output, we can see that:

• Multicast – Matched gets 1000 rows


• Multicast – Unmatched gets 18,820 rows

Figure 6.14: Viewing the number of matched and unmatched rows

Let’s make the following configuration changes in the data sampling:

• Number of rows: 10
• Use single-column AccountNumber

Execute the SSIS package twice and note the output.

First execution:

Figure 6.14: Row sampling output data viewer after first execution

Second execution:

Figure 6.15: Row sampling output data viewer after second execution

You can compare the output of the two executions. Each returns random account numbers, and the results
differ between executions. A random pick may also select some of the same account numbers again in the
second execution.

Suppose we want to get the same records on each execution: the output should still honor the specified
record count, but the records should not vary from run to run.

In the configured SSIS package, open the properties again for the row sampling transformation and set the
random seed value to 1. A fixed seed is recommended only for testing purposes.

Figure 6.16: Setting new random seed value

Execute the package again twice and observe the output.

First execution

Figure 6.17: Row sampling output data viewer after first execution with new seed value

Second execution

Figure 6.18: Row sampling output data viewer after second execution with new seed value

You get the same data in both executions. With a fixed seed, the transformation picks the same random
sample every time and does not change the data on the next execution.
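The same idea can be seen in plain T-SQL, outside of SSIS. This is only an analogy for the seed behavior, not what SSIS does internally; it assumes AdventureWorks's Sales.Customer table:

```sql
-- A random 10-row sample: NEWID() produces a fresh ordering on each
-- run, so two executions return different rows (like no-seed sampling).
SELECT TOP (10) CustomerID, AccountNumber
FROM Sales.Customer
ORDER BY NEWID();

-- A repeatable sample: hashing the key together with a fixed "seed"
-- value gives the same ordering, hence the same rows, on every run.
SELECT TOP (10) CustomerID, AccountNumber
FROM Sales.Customer
ORDER BY CHECKSUM(CustomerID, 1);
```

Changing the constant in the CHECKSUM call plays the role of changing the seed: a different constant selects a different, but again repeatable, set of rows.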

Activity 2:

Percentage Sampling Transformation in SSIS:

In the previous section, we discussed the Row Sampling Transformation in SSIS. Percentage sampling
configuration is similar to row sampling.

In row sampling, we specify the number of rows that we want in the output, such as 500 or 1000 rows.
In percentage sampling, we specify a percentage of rows. For example, if the total number of incoming rows is 1000
and we specify a 10% sample, we get approximately 100 rows in the matched output. The remaining
rows go to the unmatched output.

Similar to the row sampling transformation, it picks a random sample, so you might get a completely
different result set on each execution. We can specify a random seed value to get the same data on each
execution.

Let’s do the configuration for Percentage sampling transformation in SSIS package.

• Remove the row sampling and underlying multicast operators

• Drag a Percentage Sampling transformation from the SSIS toolbox and join the arrow between the
source data and the percentage sampling
• In the percentage sampling editor, specify the percentage of rows and the output names

Figure 6.19: Percentage sampling transformation editor

• Add two SSIS Multicast transformations


• First Multicast transformation gets the desired percentage sample of rows
• Other Multicast transformation gets unmatched rows

Figure 6.20: Setting up percentage sampling transformation

Execute the SSIS package. In the following screenshot, we can see that the percentage sampling
transformation in SSIS does the following:
• Total number of rows in source table – 19820


• Specify percentage sampling – 5%
• The first Multicast gets 1015 rows, which is approximately a 5% sample of the data

Figure 6.21: No. of rows after setting percentage sampling transformation

Let’s specify the random seed value 1 in the percentage sampling transformation and execute the
SSIS package.

Figure 6.22: Setting new random seed value for percentage sampling transformation

In both executions, it returns the same account numbers in the output.

First execution

Figure 6.23: Percentage sampling output data viewer after first execution with new seed value

Second execution

Figure 6.24: Percentage sampling output data viewer after second execution with new seed value

Conclusion:

In this lab, we explored two data sampling techniques – the Row Sampling transformation and the Percentage
Sampling transformation – in an SSIS package. You can use these transformations to test a package against
different subsets of data and analyze the results.

3) Graded Lab Tasks


Note: The instructor can design graded lab activities according to the level of difficulty and complexity
of the solved lab activities. The lab tasks assigned by the instructor should be evaluated in the same
lab.

1. When loading data into SQL Server, you have the option of using SQL Server Integration Services
to handle more complex loading and data transforms than just doing a straight load. One problem
you may face is that data is given to you in multiple files, such as sales and sales
orders, but the loading process requires you to join these flat files during the load instead of doing
a preload and then later merging the data. Using the SSIS Merge Join transformation, merge multiple
data sources and load the data into a table. You can use the employee table for data loading from any
two source files (CSV or HTML).
2. Using derived column transformation, add a new column to the table before loading data to table.
The new column will calculate the annual salary of each employee against the monthly salary (
consider the employee table for transformation).

Lab 07
Creating ROLAP Cube in SQL Server Analysis
Services (SSAS)
Objective:
The objective of this lab is to learn to design ROLAP Cube using SSAS (SQL Server Analysis Services).
Activity Outcomes:
The activities provide hands-on practice with the following topics
• Create ROLAP cube in data warehouse.
Instructor Note:
As pre-lab activity, read chapter 14 from the textbook “Data Mining and Data Warehousing: Principles
and Practical Techniques, Parteek Bhatia, Cambridge University Press, 2019”.

1) Useful Concepts
Introduction to OLAP Cube

• An OLAP cube is a technology that stores data in an optimized way to provide quick responses to
various types of complex queries by using dimensions and measures.
• Most cubes store pre-aggregates of the measures in a special storage structure to provide quick
responses to queries.
• SSRS reports and Excel Power Pivot are commonly used as front ends for reporting and data analysis
with an SSAS (SQL Server Analysis Services) OLAP cube.
• SSAS (SQL Server Analysis Services) is Microsoft's BI tool for providing online analytical processing
and data mining functionality.

What is SSAS?

SSAS stands for SQL Server Analysis Services and is a data analysis tool by Microsoft. Using
SSAS, you can analyze data coming from various sources and produce summaries of useful
information.
Using SSAS you can create two types of models:
Tabular Model – an in-memory database that is somewhat more advanced than a normal relational
database. It uses tables and other relational components such as relationships, rows, and columns.
Multidimensional Model – a data model that supports very large amounts of data (similar to, but not
exactly, big data). In addition to tabular data, the multidimensional model supports dimensions,
measures, perspectives, and multiple data sources.
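The pre-aggregation idea behind a cube can be illustrated in plain T-SQL with GROUP BY CUBE, which computes every combination of the grouping columns in one pass. This is only a conceptual sketch of what a cube stores, not how SSAS stores it internally; it assumes a fact table shaped like the Fact_Sales table created in Activity 2 below:

```sql
-- Produce aggregates for every grouping combination:
-- (productid, customerid), (productid), (customerid), and the grand total.
SELECT productid, customerid, SUM(salesTotal) AS TotalSales
FROM Fact_Sales
GROUP BY CUBE (productid, customerid);
```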

2) Solved Lab Activities

Sr. No Allocated Time Level of Complexity CLO Mapping
Activity 1 60 minutes (time can vary due to internet and system speed) Low CLO-5
Activity 2 30 minutes Medium CLO-5

Activity 1:
Installing SQL Server 2019 Analysis Services (SSAS)

Some Prerequisites
SQL Server installed as well as SQL Server Management Studio.

Setup SSAS
We will follow the steps below to set up SSAS:

Step 1 – Run the SQL Server setup. You will come to the screen below:

Figure 7.1: Installing SSAS

Step 2 – Click on the first link ‘New SQL Server stand-alone installation or add features to an existing
installation’.
Step 3 – Follow the Wizard steps. And when it comes to the Feature Selection page, select Analysis
Services as shown below:

Figure 7.2: Choose Analysis services

Step 4 – When you get to the Analysis Services Configuration, choose Tabular Mode as shown below

Figure 7.3: Choose Tabular Mode

Step 5 – Follow the steps and complete the installation.

Install SQL Server Data Tools

You need to install SQL Server Data Tools (SSDT). According to Microsoft, this tool has been integrated
into Visual Studio, so by following the steps below you will be taken to where you can download and
install Visual Studio 2019.
Step 1 – Follow the same process but select SQL Server Data Tools. See figure below:

Figure 7.4: Installing SQL Server Data Tools

Step 2 – Click on Install SQL Server Data Tools. You’ll be taken to the download page for Visual Studio.
Install it (We already installed it in Lab 1).

Setup Analysis Services Tabular Project

Activity 2:
OLAP Cube Creation
Create a database called “Testing” and run the following query to create tables:
create table Dim_Customer(
customerid int primary key,
name varchar(50)
)
create table Dim_Product(
productid int primary key,
name varchar(50),
price int
)
create table Fact_Sales(
salesid int primary key,
customerid int foreign key references Dim_Customer(customerid),
productid int foreign key references Dim_Product(productid),
salesTotal int
)

insert into Dim_Customer(customerid, name) values (1, 'Danyal');


insert into Dim_Product(productid, name, price) values (2, 'Car', 200);
insert into Fact_Sales(salesid, customerid, productid, salesTotal) values (3, 1, 2, 20);
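Before building the cube, it helps to see the kind of question it is meant to answer quickly. Against the tables above, a typical slice (total sales per customer and product) is the following join-and-group query, which the cube effectively pre-computes:

```sql
SELECT c.name AS customer,
       p.name AS product,
       SUM(f.salesTotal) AS total_sales
FROM Fact_Sales AS f
JOIN Dim_Customer AS c ON c.customerid = f.customerid
JOIN Dim_Product  AS p ON p.productid  = f.productid
GROUP BY c.name, p.name;
```

With the single sample row inserted above, this returns one row: Danyal, Car, 20.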

Create a new Analysis Services project in Microsoft Visual Studio:

Figure 7.11: Creating new analysis service project

Create a new data source by right-clicking Data Sources in Solution Explorer:

Note: Use your Windows username and password.

Create a new cube by right-clicking Cubes in Solution Explorer:

Deploying:

Go to project properties and enter your server name from SSMS and database name as “Testing”.

Click on start to start the deployment.

Once deployment is successful, right-click the sales cube and click Process to process the cube.

3) Graded Lab Tasks
Note: The instructor can design graded lab activities according to the level of difficulty and complexity of
the solved lab activities. The lab tasks assigned by the instructor should be evaluated in the same lab.

Lab Task 1
Star schema for sales DWH is given below. You are required to create this Data warehouse. Queries for
table creation and data loading can be found on the link:
https://www.codeproject.com/Articles/652108/Create-First-Data-WareHouse

For the sales data warehouse create the OLAP cube.

Lab 08

Get Started with Power BI Desktop


Objective:
The objective of this lab is to teach the students about the data visualization through Power BI with the help of examples
and learning tasks.

Activity Outcomes:
The activities provide hands-on practice with the following topics
• Get familiar with PowerBI basic ribbon and operations.
• Load data from CSV files
• Remove NULLs from Data
• Create basic visualization

Instructor Note:
As pre-lab activity, read chapter 14 from the textbook “Business Intelligence Guidebook: From Data
Integration to Analytics, Rick Sherman, Morgan Kaufmann Press, 2014”.

1) Useful Concepts

Introducing Power BI
Power BI is a suite of business analytics tools which connects to different data sources to analyze data and share
insights throughout your organization.

Parts of Power BI

There are 3 Parts of Power BI.


1. Power BI Desktop
2. Power BI Service
3. Power BI Mobile

Power BI Desktop: A Windows desktop application (report authoring tool) which lets you build queries,
models, and reports that visualize data.

Power BI Service: A cloud-based Software-as-a-Service application which allows you to create
dashboards, set up scheduled data refreshes, and share reports securely within the organization.

Power BI Mobile: An application (app) on mobile devices which allows you to interact with the reports and
dashboards from the Power BI Service.

The flow of work in Power BI


A common flow of work in Power BI begins in Power BI Desktop, where a report is created. That report is then
published to the Power BI service, and then shared so users of Power BI Mobile apps can consume the
information.

It doesn’t always happen that way, and that’s okay, but we’ll use that flow to help you learn the various parts of
Power BI, and how they complement one another.

Power BI Desktop:
Power BI Desktop is a report authoring tool that allows you to create reports and queries, extract, transform,
and load (ETL) data from data sources, and model the queries.

Power BI Desktop Interface: The Report has five main areas:

1. Ribbon: The Ribbon displays common tasks associated with reports and visualizations;
2. Pages: The Pages tab area along the bottom allows you to select or add a report page;
3. Visualizations: The Visualizations pane allows you to change visualizations, customize colors or axes,
apply filters, drag fields, and more;
4. Fields: The Fields pane, allows you to drag and drop query elements and filters onto the Report view, or
drag to the Filters area of the Visualizations pane;
5. Views Pane: There are three types of views in the views pane
▪ Reports View – allows you to create any number of report pages with visualizations.
▪ Data View – allows you to inspect, explore, and understand data in your Power BI Desktop model.
▪ Relationship or Model view – allows you to show all of the tables, columns, and relationships in your
model.

2) Solved Lab Activities


Sr.No Allocated Time Level of Complexity CLO Mapping
1 10 Low CLO-6
2 10 Low CLO-6
3 20 Medium CLO-6
4 10 Low CLO-6

Activity 1:
Querying Data from CSV

Solution:

Query Editor
You can import and clean data from many different sources while working in Power BI.
The Query Editor allows you to connect to one or many data sources, shape and transform the data to meet your
business needs, and then load the queries into the Power BI Desktop model.
The steps below provide an overview of working with data, including connecting to data sources and shaping the
data in the Query Editor.

Get Started with Query Editor


1. To get to Query Editor, select Edit Queries from the Home tab of Power BI Desktop.

2. Click on the drop down of the Edit Queries on the bottom right corner, click on Edit Queries

Note: With no data connections, Query Editor appears as a blank pane, ready for data.

Below image shows the interface of the Query Editor

Connecting the data from the Excel Source


3. From Home tab > New Source > Choose Excel

4. Navigate to the Strategic Plan and Dashboard Folder and Choose
PowerBITraining_StrategicPlanDashboard_Input_Template Excel File

5. Click on Open ( ) at the bottom of the screen


You can see a Navigator screen to select the sheets in the Excel workbook. In our case, we have one sheet, named
Input

6. Select Input sheet from the available list

7. Click OK ( )at the bottom of the screen

Interface of Query Editor


Query Editor consists of 4 Parts
1. Query Ribbon

2. Left Pane

3. Center (Data) Pane

4. Query Settings

The Query Ribbon
The Ribbon in Query Editor consists of four tabs
▪ Home
▪ Transform
▪ Add Column
▪ View
Home Tab: The Home tab contains the common query tasks, including the first step in any query, which is Get
Data.

Transform: The Transform tab provides access to common data transformation tasks, such as adding or
removing columns, changing data types, splitting columns, and other data-driven tasks.

Add Column: The Add Column tab provides additional tasks associated with adding a column, formatting
column data, and adding custom columns. The following image shows the Add Column tab.

View Tab: The View tab on the ribbon is used to toggle whether certain panes or windows are displayed. It’s also
used to display the Advanced Editor. The following image shows the View tab.

The Left pane:

The left pane displays the number of active queries, as well as the name of the query. When you select a
query from the left pane, its data is displayed in the center pane, where you can shape and transform the
data to meet your needs.

The center (data) pane:
In the Center pane, or Data pane, data from the selected query is displayed. This is where much of the work of
the Query view is accomplished.

The Query settings pane:


The Query Settings pane is where all steps associated with a query are displayed.

Activity 2:

Clean, Transform the data (Removing Nulls)

Solution:
Removing the unwanted rows in the query.
8. Home Tab > Reduce Rows section > Remove Rows > Remove Blank Rows

Notice that the null records are eliminated, and a new step is added to the Query Settings pane for the
transformation you applied to the query.

Note: Each step you perform in the Query Editor is recorded in the Applied Steps list of the Query Settings pane.

9. From Home Ribbon > Click on Close & Apply

Note: After Close & Apply the query is added to the model for report development.

Activity 3:
Creating Simple Reports & Visualizations

Solution:
Creating your first visualization (Completion % of All Goals) Gauge Chart
1. Click on Visualizations Pane and Click on Gauge Chart

Note: Make sure the Visualization is selected before dropping the fields.

2. Expand Input Query, Drag Overall Completion% to the Value section of the Fields pane of the gauge Visual

Importing a Theme to a Power BI Desktop File.

With Report Themes you can apply design changes to your entire report, such as using corporate colors,
changing icon sets, or applying new default visual formatting. When you apply a Report Theme, all visuals in
your report use the colors and formatting from your selected theme.

3. From the Home ribbon of the Report view, click the drop-down of Switch Theme under the Themes section
and select Import from file.

A window appears that lets you browse to the location of the JSON theme file

4. Navigate to the Strategic Plan and Dashboard folder on the Desktop and select the Power BI Color Theme.json file

5. Click on Open ( ) at the bottom of the screen

You will get a success message once the theme is imported successfully.

Changing the Color of the Gauge.

6. Select the Gauge Chart and Click on the Format of the Gauge Chart, Expand Data Colors properties,
click on the drop down of Fill property and select light blue color

After changing the color, the gauge chart looks like the one below.

7. Click on the drop down of Target property and select Black color.

Changing the Title of the Gauge Chart.
8. Expand the title property of the Gauge chart, Change the title text to “Completion% of All 4 Goals”.

We are done with our first visualization. We will create few more visualizations.

Creating the Stacked Column Chart.
9. Click anywhere on the canvas other than the visuals, select Stacked Column Chart, and bring the visual next to
the Donut Chart.

10. Expand Input, Drag Overall Completion% to the Value section, Goal Detail to the Legend, Goal to the
Axis of the Fields pane of the Stacked Column Visual.

Notice that the goals are not in the right order.

Sorting the Goals in the right order.

11. Click on the ellipses ( More Options) of the Stacked Column Visual, Select Sort Ascending,
Hover on Sort by and Select Goal Detail.

12. Click on the format icon ( ) for the visual, Expand Title and edit the title to “Goal Completion% by Goal”

Notice that the Y axis is not 100%


13. Expand Y Axis property, In the End Box, Type in 1

14. Turn on the Data Labels Property.

15. Click anywhere on the Canvas other than the visuals, select Stacked Column Chart and bring the visual below
the Donut Chart.
16. Expand Input, Drag Overall Completion% to the Value section, Performance Measure/Milestone Detail to
the Axis, Champion to the tool tip of the Fields pane of the Stacked Column Visual.

Filters in Power BI
Filters allows the Power BI visual to narrow down or filter to the desired result. We are filtering the
visual to show just the data for Goal.
17. Expand the Filters pane, drag Goal to the “Add data fields here” section under the Filters on this visual
section, and select Goal 1

18. Click on the format icon for the Stacked Column Chart visual, expand Title, and edit the title to Goal 1
Completion%

19. Turn on the Data Labels Property, Expand Y axis Property and in the End box Type 1

Adjust the height and width of the visual.

20. Click on the Stacked Column Chart visual and copy & paste it, Adjust the position on the Report page

Note: It is like MS word Copy (Ctrl + C) and Paste (Ctrl + V)

21. Click on the format icon for the Stacked Column Chart visual, expand Title, and edit the title to Goal 2
Completion %

22. Expand the filters pane, click on the drop down of the Goal Filter on Filters Pane and select Goal 2

Notice that the Stacked Column Chart visual is automatically changed to the reflect the data to the Goal 2.

23. Click on the format icon ( ) for the Stacked Column chart visual, expand Data colors property, Change
the color to reflect the color for Goal 2 on the Goal Completion % by Goal.

24. Click on the Stacked Column Chart visual and copy & paste it, Adjust the position on the Report page

25. Click on the format icon for the Stacked Column Chart visual, expand Title, and edit the title to Goal 3
Completion %

26. Expand the filters pane, click on the drop down of the Goal Filter on Filters Pane and select Goal 3

27. Click on the format icon for the Stacked Column Chart visual, expand the Data colors property, and change
the color to reflect the color for Goal 3 on the Goal Completion % by Goal chart.

28. Click on the Stacked Column Chart visual and copy & paste it, Adjust the position on the Report page

29. Click on the format icon for the Stacked Column Chart visual, expand Title, and edit the title to Goal 4
Completion %

30. Expand the filters pane, click on the drop down of the Goal Filter on Filters Pane and select Goal 4

31. From the Home ribbon, click on the Text Box, type in “Strategic Plan Dashboard”, and increase
the font size to 21.

3) Graded Lab Tasks
Note: The instructor can design graded lab activities according to the level of difficulty and complexity
of the solved lab activities. The lab tasks assigned by the instructor should be evaluated in the same
lab.
Lab Task 1
Download and install Power BI Desktop. Explore the Power BI Desktop interface. Import a dataset (from
any online site, e.g., Kaggle, data.world, etc.) into Power BI Desktop. Create different visualizations using
the imported data.

Lab 10

Preparing Data in Power BI Desktop

Objective:
The objective of this lab is to practice connecting to source data, previewing the data, and using data preview
techniques to understand the characteristics and quality of the source data.

Activity Outcomes:
The activities provide hands-on practice with the following topics


• Set Power BI Desktop options
• Connect to source data
• Preview source data
• Use data preview techniques to better understand the data

Instructor Note:
As pre-lab activity, read chapter 14 from the textbook “Business Intelligence Guidebook: From Data
Integration to Analytics, Rick Sherman, Morgan Kaufmann Press, 2014”.

1) Useful Concepts

2) Solved Lab Activities


Sr.No Allocated Time Level of Complexity CLO Mapping
1 10 Low CLO-6
2 10 Low CLO-6
3 20 Medium CLO-6
4 10 Low CLO-6

Activity 1:
Prepare Data

Solution:
In this exercise, you will create eight Power BI Desktop queries. Six queries will source data from SQL Server,
and two from CSV files.

Save the Power BI Desktop file


In this task, you will first save the Power BI Desktop file.
1. In Power BI Desktop, click the File ribbon tab to open the backstage view.
2. Select Save.

3. In the Save As window, navigate to the D:\DA100\MySolution folder.

4. In the File Name box, enter Sales Analysis.

5. Click Save.

Tip: You can also save the file by clicking the Save icon located at the top-right.

Set Power BI Desktop options


1. In Power BI Desktop, click the File ribbon tab to open the backstage view.
2. At the left, select Options and Settings, and then select Options.

3. In the Options window, at the left, in the Current File group, select Data Load.

The Data Load settings for the current file allow setting options that determine default behaviors when
modeling.

4. In the Relationships group, uncheck the two options that are checked.

While these two options can be helpful when developing a data model, they have been disabled to support
the lab experience. When you create relationships in a later lab, you will learn why each option matters.

5. Click OK.

6. Save the Power BI Desktop file.

Get data from SQL Server

In this task, you will create queries based on SQL Server tables.

1. On the Home ribbon tab, from inside the Data group, click SQL Server.

2. In the SQL Server Database window, in the Server box, enter localhost.

In the labs, you will connect to the SQL Server database by using localhost. This isn't a recommended
practice when creating your own solutions, however, because gateway data sources cannot resolve
localhost.

3. Click OK.

4. Notice that the default authentication is to Use My Current Credentials.

5. Click Connect.

6. When prompted about encryption support, click OK.

7. In the Navigator window, at the left, expand the AdventureWorksDW2020 database.

The AdventureWorksDW2020 database is based on the AdventureWorksDW2017 sample database.


It has been modified to support the learning objectives of the course labs.

8. Select—but don’t check—the DimEmployee table.

9. In the right pane, notice a preview of the table.

The preview allows you to determine the columns and a sample of rows.

10. To create queries, check the following six tables:

o DimEmployee
o DimEmployeeSalesTerritory
o DimProduct
o DimReseller
o DimSalesTerritory
o FactResellerSales

11. To apply transformations to the data of the selected tables, click Transform Data.

You won’t be transforming the data in this lab. The objectives of this lab are to explore and profile the
data in the Power Query Editor window.

Preview SQL Server queries

In this task, you will preview the data of the SQL Server queries. First, you will learn relevant information about
the data. You will also use column quality, column distribution, and column profile tools to understand the data,
and assess data quality.

1. In the Power Query Editor window, at the left, notice the Queries pane.

The Queries pane contains one query for each selected table.

2. Select the first query—DimEmployee.

The DimEmployee table stores one row for each employee. A subset of the rows represent the
salespeople, which will be relevant to the model you’ll develop.

3. At the bottom left, in the status bar, notice the table statistics—the table has 33 columns, and 296 rows.

4. In the data preview pane, scroll horizontally to review all columns.

5. Notice that the last five columns contain Table or Value links.

These five columns represent relationships to other tables in the database. They can be used to join
tables together.

6. To assess column quality, on the View ribbon tab, from inside the Data Preview group, check Column
Quality.

Column quality allows you to easily determine the percentage of valid, error, or empty values.

7. For the Position column (sixth last column), notice that 94% of rows are empty (null).

8. To assess column distribution, on the View ribbon tab, from inside the Data Preview group, check
Column Distribution.

9. Review the Position column again, and notice that there are four distinct values, and one unique value.

10. Review the column distribution for the EmployeeKey (first) column—there are 296 distinct values, and
296 unique values.

When the distinct and unique counts are the same, it means the column contains unique values. When
modeling, it’s important that some tables contain unique columns.
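You can confirm the same uniqueness property with a quick query in SSMS (table and column names as used in this lab's database):

```sql
-- If the two counts match, every EmployeeKey value occurs exactly once,
-- which is what the "one" side of a model relationship requires.
SELECT COUNT(*)                    AS total_rows,
       COUNT(DISTINCT EmployeeKey) AS distinct_keys
FROM dbo.DimEmployee;
```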

11. In the Queries pane, select the DimEmployeeSalesTerritory query.

The DimEmployeeSalesTerritory table stores one row for each combination of employee and the sales
territory region they manage. The table supports relating many regions to a single employee: some
employees manage one, two, or possibly more regions. When you model this data, you will need to define
a many-to-many relationship, which you will do in a later lab.
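The many-to-many shape of this bridge table is easy to verify in SQL. A sketch, assuming the table has one EmployeeKey value per row as described:

```sql
-- List salespeople who manage more than one sales territory region.
SELECT EmployeeKey, COUNT(*) AS regions_managed
FROM dbo.DimEmployeeSalesTerritory
GROUP BY EmployeeKey
HAVING COUNT(*) > 1
ORDER BY regions_managed DESC;
```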

12. In the Queries pane, select the DimProduct query.

The DimProduct table contains one row per product sold by the company.

13. Horizontally scroll to reveal the last columns.

14. Notice the DimProductSubcategory column.

When you add transformations to this query in the next lab, you’ll use the
DimProductSubcategory column to join tables.

15. In the Queries pane, select the DimReseller query.

The DimReseller table contains one row per reseller. Resellers sell, distribute, or value add Adventure
Works’ products.

16. To view column values, on the View ribbon tab, from inside the Data Preview group, check Column
Profile.

17. Select the BusinessType column header.

18. Notice that a new pane opens beneath the data preview pane.

19. Review the column statistics and value distribution.

20. Notice the data quality issue: there are two labels for warehouse (Warehouse, and the misspelled Ware
House).

21. Hover the cursor over the Ware House bar, and notice that there are five rows with this value.

In the next lab, you will apply a transformation to relabel these five rows.
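A value-frequency query is the set-based equivalent of the Column Profile pane and confirms the same data quality issue:

```sql
-- Expect two spellings for warehouse: 'Warehouse' and the
-- misspelled 'Ware House' (five rows, per the profile above).
SELECT BusinessType, COUNT(*) AS reseller_count
FROM dbo.DimReseller
GROUP BY BusinessType
ORDER BY reseller_count DESC;
```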

22. In the Queries pane, select the DimSalesTerritory query.

The DimSalesTerritory table contains one row per sales region, including Corporate HQ
(headquarters). Regions are assigned to a country, and countries are assigned to groups. In a later lab,
you will create a hierarchy to support analysis at the region, country, or group level.

23. In the Queries pane, select the FactResellerSales query.

The FactResellerSales table contains one row per sales order line—a sales order contains one or more
line items.

24. Review the column quality for the TotalProductCost column, and notice that 8% of the rows are empty.

Missing TotalProductCost column values are a data quality issue. To address the issue, in the next lab
you will apply transformations to fill in the missing values by using the product standard cost, which is
stored in the DimProduct table.
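In the lab itself you will repair this with a Power Query transformation, but the intent can be sketched in T-SQL (column names assumed from the standard AdventureWorksDW schema):

```sql
-- Substitute the product's standard cost when TotalProductCost is NULL.
-- OrderQuantity scales the per-unit standard cost to the line-item level.
SELECT f.SalesOrderNumber,
       COALESCE(f.TotalProductCost,
                p.StandardCost * f.OrderQuantity) AS TotalProductCost
FROM dbo.FactResellerSales AS f
JOIN dbo.DimProduct        AS p ON p.ProductKey = f.ProductKey;
```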

Get data from a CSV file

In this task, you will create a query based on a CSV file.


1. To add a new query, in the Power Query Editor window, on the Home ribbon tab, from inside the New
Query group, click the New Source down-arrow, and then select Text/CSV.

2. In the Open window, navigate to the D:\DA100\Data folder, and select the ResellerSalesTargets.csv
file.

3. Click Open.
4. In the ResellerSalesTargets.csv window, notice the data preview.
5. Click OK.

6. In the Queries pane, notice the addition of the ResellerSalesTargets query.

The ResellerSalesTargets CSV file contains one row per salesperson, per year. Each row records 12
monthly sales targets (expressed in thousands). The business year for the Adventure Works company
commences on July 1.

7. Notice that no columns contain empty values.

When there isn’t a monthly sales target, a hyphen character is stored instead.

8. Review the icons in each column header, to the left of the column name.

The icons represent the column data type. 123 is whole number, and ABC is text.

In the next lab, you’ll apply many transformations to achieve a different shaped result consisting of only
three columns: Date, EmployeeKey, and TargetAmount.

Get additional data from a CSV file


In this task, you will create an additional query based on a different CSV file.
1. Use the steps in the previous task to create a query based on the
D:\DA100\Data\ColorFormats.csv file.

The ColorFormats CSV file contains one row per product color. Each row records the HEX codes to
format background and font colors. In the next lab, you will integrate this data with the DimProduct
query data.

Finish up

In this task, you will complete the lab.

1. On the View ribbon tab, from inside the Data Preview group, uncheck the three data preview options:

o Column quality
o Column distribution
o Column profile

2. To save the Power BI Desktop file, on the File backstage view, select Save.

3. When prompted to apply the queries, click Apply Later.

Applying the queries will load their data to the data model. You’re not ready to do that, as there are
many transformations that must be applied first.

3) Graded Lab Tasks

Note: The instructor can design graded lab activities according to the level of difficulty and complexity
of the solved lab activities. The lab tasks assigned by the instructor should be evaluated in the same
lab.
Lab Task 1
- Load your data from any database and perform the tasks from lab contents.

Lab 11

Transformation using PowerBI

Objective:
The objective of this lab is to apply transformations to each of the queries created in the previous lab.

Activity Outcomes:
The activities provide hands-on practice with the following topics
• Apply various transformations
• Apply queries to load them to the data model

Instructor Note:
As pre-lab activity, read Chapter xx from the text book “”.

1) Useful Concepts

2) Solved Lab Activities


Sr.No Allocated Time Level of Complexity CLO Mapping
1 10 Low CLO-6
2 10 Low CLO-6
3 20 Medium CLO-6
4 10 Low CLO-6
5 20 Medium CLO-6
6 20 Medium CLO-6

Activity 1:
Transformations on the queries.

Solution:

Load Data
In this exercise, you will apply transformations to each of the queries created in the previous lab.

Task 1: Configure the Salesperson query

In this task, you will configure the Salesperson query.

1. In the Power Query Editor window, in the Queries pane, select the DimEmployee query.

2. To rename the query, in the Query Settings pane (located at the right), in the Name box, replace the
text with Salesperson, and then press Enter.

The query name will determine the model table name. It’s recommended to define concise, yet friendly,
names.

3. In the Queries pane, verify that the query name has updated.

You will now filter the query rows to retrieve only employees who are salespeople.

4. To locate a specific column, on the Home ribbon tab, from inside the Manage Columns group, click
the Choose Columns down-arrow, and then select Go to Column.

Tip: This technique is useful when a query contains many columns. Usually, you can simply horizontally
scroll to locate the column.

5. In the Go to Column window, to order the list by column name, click the AZ sort button, and then
select Name.

6. Select the SalesPersonFlag column, and then click OK.

7. To filter the query, in the SalesPersonFlag column header, click the down-arrow, and then uncheck
FALSE.

8. Click OK.

9. In the Query Settings pane, in the Applied Steps list, notice the addition of the Filtered Rows step.

Each transformation you create results in additional step logic. It’s possible to edit or delete steps. It’s
also possible to select a step to preview the query results at that stage of transformation.

10. To remove columns, on the Home ribbon tab, from inside the Manage Columns group, click the
Choose Columns icon.

11. In the Choose Columns window, to uncheck all columns, uncheck the (Select All Columns) item.

12. To include columns, check the following six columns:

o EmployeeKey
o EmployeeNationalIDAlternateKey
o FirstName
o LastName
o Title
o EmailAddress

13. Click OK.

14. In the Applied Steps list, notice the addition of another query step.

15. To create a single name column, first select the FirstName column header.

16. While pressing the Ctrl key, select the LastName column.

17. Right-click either of the selected column headers, and then in the context menu, select Merge Columns.

Many common transformations can be applied by right-clicking the column header, and then choosing
them from the context menu. Note, however, that all transformations—and more—are available in the
ribbon.

18. In the Merge Columns window, in the Separator dropdown list, select Space.

19. In the New Column Name box, replace the text with Salesperson.

20. Click OK.

21. To rename the EmployeeNationalIDAlternateKey column, double-click the
EmployeeNationalIDAlternateKey column header.

22. Replace the text with EmployeeID, and then press Enter.

When instructed to rename columns, it’s important that you rename them exactly as described.

23. Use the previous steps to rename the EmailAddress column to UPN.

24. At the bottom-left, in the status bar, verify that the query has five columns and 18 rows.

It’s important that you do not proceed if your query does not produce the correct result—it won’t be
possible to complete later labs. If it doesn’t, refer back to the steps in this task to fix any problems.
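As a cross-check, the query shaped in this task is roughly equivalent to the following T-SQL, which you can run in SSMS to compare results (it should return the same five columns and 18 rows):

```sql
-- The Power Query steps above (filter, column selection, merge, renames)
-- expressed as a single SELECT statement.
SELECT EmployeeKey,
       EmployeeNationalIDAlternateKey AS EmployeeID,
       FirstName + ' ' + LastName     AS Salesperson,
       Title,
       EmailAddress                   AS UPN
FROM dbo.DimEmployee
WHERE SalesPersonFlag = 1;
```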

Activity 2:
Configure the SalespersonRegion query

Solution:
In this task, you will configure the SalespersonRegion query.

1. In the Queries pane, select the DimEmployeeSalesTerritory query.

2. In the Query Settings pane, rename the query to SalespersonRegion.

3. To remove the last two columns, first select the DimEmployee column header.

4. While pressing the Ctrl key, select the DimSalesTerritory column header.

5. Right-click either of the selected column headers, and then in the context menu, select Remove Columns.

6. In the status bar, verify that the query has two columns and 39 rows.

Activity 3:
Configure the Product query

Solution:
In this task, you will configure the Product query.

When detailed instructions have already been provided in the labs, the lab steps will now provide more concise
instructions. If you need the detailed instructions, you can refer back to other tasks.

1. Select the DimProduct query.


2. Rename the query to Product.

3. Locate the FinishedGoodsFlag column, and then filter the column to retrieve products that are finished
goods (i.e. TRUE).

4. Remove all columns, except the following:

o ProductKey
o EnglishProductName
o StandardCost
o Color
o DimProductSubcategory

5. Notice that the DimProductSubcategory column represents a related table (it contains Value links).

6. In the DimProductSubcategory column header, at the right of the column name, click the expand
button.

7. To uncheck all columns, uncheck the (Select All Columns) item.

8. Check the EnglishProductSubcategoryName and DimProductCategory columns.

By selecting these two columns, a transformation will be applied to join to the
DimProductSubcategory table, and then include these columns. The
DimProductCategory column is, in fact, another related table.

9. Uncheck the Use Original Column Name as Prefix checkbox.

Query column names must always be unique. When checked, this checkbox would prefix each column
with the expanded column name (in this case DimProductSubcategory). Because it’s known that the
selected columns don’t collide with columns in the Product query, the option is deselected.

10. Click OK.

11. Notice that the transformation resulted in two columns, and that the DimProductCategory column has
been removed.

12. Expand the DimProductCategory column, and then include only the EnglishProductCategoryName column.

13. Rename the following four columns:

o EnglishProductName to Product
o StandardCost to Standard Cost (include a space)
o EnglishProductSubcategoryName to Subcategory
o EnglishProductCategoryName to Category

14. In the status bar, verify that the query has six columns and 397 rows.
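Because each product row relates to a single subcategory, the DimProductSubcategory column holds records, so expanding it produces Table.ExpandRecordColumn steps in M. A hedged sketch (step names are illustrative):

```powerquery
// Expand the related subcategory, keeping its name and its own related category table
ExpandedSubcategory = Table.ExpandRecordColumn(FilteredRows, "DimProductSubcategory",
    {"EnglishProductSubcategoryName", "DimProductCategory"},
    {"EnglishProductSubcategoryName", "DimProductCategory"}),
// Expand one level further to reach the category name
ExpandedCategory = Table.ExpandRecordColumn(ExpandedSubcategory, "DimProductCategory",
    {"EnglishProductCategoryName"}, {"EnglishProductCategoryName"})
```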

Activity 4:
Configure the Reseller query

Solution:
In this task, you will configure the Reseller query.

1. Select the DimReseller query.

2. Rename the query to Reseller.

3. Remove all columns, except the following:

o ResellerKey
o BusinessType
o ResellerName
o DimGeography

4. Expand the DimGeography column, to include only the following three columns:

o City
o StateProvinceName
o EnglishCountryRegionName
5. In the Business Type column header, click the down-arrow, review the list items, and notice the incorrect spelling of warehouse ("Ware House").

6. Right-click the Business Type column header, and then select Replace Values.

7. In the Replace Values window, configure the following values:

o In the Value to Find box, enter Ware House
o In the Replace With box, enter Warehouse

8. Click OK.

9. Rename the following four columns:

o BusinessType to Business Type (include a space)
o ResellerName to Reseller
o StateProvinceName to State-Province
o EnglishCountryRegionName to Country-Region
10. In the status bar, verify that the query has six columns and 701 rows.
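The Replace Values step above generates a Table.ReplaceValue call in M. A sketch, assuming a hypothetical previous step name (ExpandedGeography):

```powerquery
// Replace the misspelled "Ware House" with "Warehouse" in the BusinessType column
ReplacedValue = Table.ReplaceValue(ExpandedGeography,
    "Ware House", "Warehouse", Replacer.ReplaceText, {"BusinessType"})
```

Replacer.ReplaceText substitutes matching text within values; cleansing inconsistent category labels like this at load time keeps downstream reports consistent.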

Activity 5:
Configure the Region query

Solution:
In this task, you will configure the Region query.

1. Select the DimSalesTerritory query.

2. Rename the query to Region.

3. Apply a filter to the SalesTerritoryAlternateKey column to remove the value 0 (zero).

4. Remove all columns, except the following:

o SalesTerritoryKey
o SalesTerritoryRegion
o SalesTerritoryCountry
o SalesTerritoryGroup

5. Rename the following three columns:

o SalesTerritoryRegion to Region
o SalesTerritoryCountry to Country
o SalesTerritoryGroup to Group

6. In the status bar, verify that the query has four columns and 10 rows.
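The row filter in step 3 corresponds to a Table.SelectRows step in M. A minimal sketch (the previous step name is a placeholder):

```powerquery
// Remove the unknown-member row where the alternate key is 0
FilteredRows = Table.SelectRows(PreviousStep, each [SalesTerritoryAlternateKey] <> 0)
```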

Configure the Sales query


In this task, you will configure the Sales query.

1. Select the FactResellerSales query.

2. Rename the query to Sales.

3. Remove all columns, except the following:

o SalesOrderNumber
o OrderDate
o ProductKey
o ResellerKey
o EmployeeKey
o SalesTerritoryKey
o OrderQuantity
o UnitPrice
o TotalProductCost
o SalesAmount
o DimProduct

4. Expand the DimProduct column, and then include the StandardCost column.

5. To create a custom column, on the Add Column ribbon tab, from inside the General group, click Custom Column.

6. In the Custom Column window, in the New Column Name box, replace the text with Cost.

7. In the Custom Column Formula box, enter the following expression (after the equals symbol):

if [TotalProductCost] = null then [OrderQuantity] * [StandardCost] else [TotalProductCost]

8. For your convenience, you can copy the expression from the D:\DA100\Lab03A\Assets\Snippets.txt file.

This expression tests whether the TotalProductCost value is missing. If it is, it produces a value by multiplying the OrderQuantity value by the StandardCost value; otherwise, it uses the existing TotalProductCost value.
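In the generated M, this conditional expression becomes the generator function of a Table.AddColumn step, roughly as follows (the previous step name is illustrative):

```powerquery
// Add a Cost column, falling back to OrderQuantity * StandardCost when TotalProductCost is null
AddedCost = Table.AddColumn(ExpandedProduct, "Cost",
    each if [TotalProductCost] = null then [OrderQuantity] * [StandardCost] else [TotalProductCost])
```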

9. Click OK.

10. Remove the following two columns:

o TotalProductCost
o StandardCost

11. Rename the following three columns:
o OrderQuantity to Quantity
o UnitPrice to Unit Price (include a space)
o SalesAmount to Sales

12. To modify the column data type, in the Quantity column header, at the left of the column name, click
the 1.2 icon, and then select Whole Number.

Configuring the correct data type is important. When a column contains numeric values, it's also important to choose the correct type if you expect to perform mathematical calculations.

13. Modify the following three column data types to Fixed Decimal Number.

o Unit Price
o Sales
o Cost

The fixed decimal number data type stores values with full precision, and so requires more storage space than the decimal number type. It's important to use the fixed decimal number type for financial values, or rates (like exchange rates).
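In M terms, whole number maps to Int64.Type and fixed decimal number maps to Currency.Type, so these type changes might be generated roughly as (step names illustrative):

```powerquery
// Whole number -> Int64.Type; fixed decimal number -> Currency.Type
ChangedTypes = Table.TransformColumnTypes(AddedCost,
    {{"Quantity", Int64.Type}, {"Unit Price", Currency.Type},
     {"Sales", Currency.Type}, {"Cost", Currency.Type}})
```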

14. In the status bar, verify that the query has 10 columns and 999+ rows.

A maximum of 1000 rows will be loaded as preview data for each query.

Configure the Targets query

In this task, you will configure the Targets query.

1. Select the ResellerSalesTargets query.

2. Rename the query to Targets.
3. To unpivot the 12 month columns (M01-M12), first multi-select the Year and EmployeeID column
headers.

4. Right-click either of the selected column headers, and then in the context menu, select Unpivot Other Columns.

5. Notice that the column names now appear in the Attribute column, and the values appear in the Value
column.

6. Apply a filter to the Value column to remove hyphen (-) values.

7. Rename the following two columns:

o Attribute to MonthNumber (no space between the two words—it will be removed later)
o Value to Target
You will now apply transformations to produce a date column. The date will be derived from the Year
and MonthNumber columns. You will create the column by using the Columns From Examples feature.

8. To prepare the MonthNumber column values, right-click the MonthNumber column header, and then
select Replace Values.

9. In the Replace Values window, in the Value To Find box, enter M.

10. Click OK.

11. Modify the MonthNumber column data type to Whole Number.

12. On the Add Column ribbon tab, from inside the General group, click the Column From Examples icon.

13. Notice that the first row is for year 2017 and month number 7.

14. In the Column1 column, in the first grid cell, begin entering 7/1/2017, and then press Enter.

The virtual machine uses US regional settings, so this date is in fact July 1, 2017.

15. Notice that the grid cells update with predicted values.

The feature has accurately predicted that you are combining values from two columns.

16. Notice also the formula presented above the query grid.

17. To rename the new column, double-click the Merged column header.

18. Rename the column as TargetMonth.

19. Click OK.

20. Remove the following columns:

o Year
o MonthNumber

21. Modify the following column data types:

o Target as fixed decimal number
o TargetMonth as date

22. To multiply the Target values by 1000, select the Target column header, and then on the Transform
ribbon tab, from inside the Number Column group, click Standard, and then select Multiply.

23. In the Multiply window, in the Value box, enter 1000.

24. Click OK.

25. In the status bar, verify that the query has three columns and 809 rows.
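The key transformations in this query — the unpivot and the multiplication — correspond to M such as the following sketch (step names are assumptions; the intermediate filter, rename, and typing steps are elided):

```powerquery
// Keep Year and EmployeeID fixed; rotate the M01-M12 columns into Attribute/Value pairs
Unpivoted = Table.UnpivotOtherColumns(Renamed, {"Year", "EmployeeID"}, "Attribute", "Value"),
// ... filtering, renaming, and typing steps ...
// Scale the target values, which are stored in thousands
Multiplied = Table.TransformColumns(Typed, {{"Target", each _ * 1000, type number}})
```

Unpivoting turns a wide, report-shaped layout into a tall, model-friendly one: one row per employee per month, which is far easier to relate and aggregate.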

Configure the ColorFormats query

In this task, you will configure the ColorFormats query.

1. Select the ColorFormats query.

2. Notice that the first row contains the column names.

3. On the Home ribbon tab, from inside the Transform group, click Use First Row as Headers.

4. In the status bar, verify that the query has three columns and 10 rows.
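The Use First Row as Headers command generates a Table.PromoteHeaders step in M, roughly:

```powerquery
// Promote the first data row to become the column headers
PromotedHeaders = Table.PromoteHeaders(Source, [PromoteAllScalars = true])
```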

Activity 6:
Update the Product query
Solution:
In this task, you will update the Product query by merging the ColorFormats query.
1. Select the Product query.

2. To merge the ColorFormats query, on the Home ribbon tab, from inside the Combine group, click
Merge Queries.

Merging queries allows you to integrate data, in this case from different data sources (SQL Server and a CSV file).

3. In the Merge window, in the Product query grid, select the Color column header.

4. Beneath the Product query grid, in the dropdown list, select the ColorFormats query.

5. In the ColorFormats query grid, select the Color column header.

6. When the Privacy Levels window opens, for each of the two data sources, in the corresponding
dropdown list, select Organizational.

Privacy levels can be configured for each data source to determine whether data can be shared between sources. Setting each data source as Organizational allows them to share data, if necessary. Note that Private data sources can never be shared with other data sources. It doesn't mean that Private data cannot be shared; it means that the Power Query engine cannot share data between the sources.

7. Click Save.

8. In the Merge window, click OK.

9. Expand the ColorFormats column to include the following two columns:

o Background Color Format
o Font Color Format

10. In the status bar, verify that the query now has eight columns and 397 rows.
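A merge followed by an expand generates a Table.NestedJoin step and a Table.ExpandTableColumn step in M. A hedged sketch (step names illustrative):

```powerquery
// Left outer join Product to ColorFormats on the Color column
Merged = Table.NestedJoin(Renamed, {"Color"}, ColorFormats, {"Color"}, "ColorFormats", JoinKind.LeftOuter),
// Expand the nested table to surface the two formatting columns
ExpandedFormats = Table.ExpandTableColumn(Merged, "ColorFormats",
    {"Background Color Format", "Font Color Format"},
    {"Background Color Format", "Font Color Format"})
```

A left outer join keeps every product row even if no matching color format exists, which is the safe default for enrichment lookups like this.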

Update the ColorFormats query

In this task, you will update the ColorFormats query to disable its load.

1. Select the ColorFormats query.

2. In the Query Settings pane, click the All Properties link.

3. In the Query Properties window, uncheck the Enable Load To Report checkbox.

Disabling the load means it will not load as a table to the data model. This is done because the query was merged with the Product query, which is enabled to load to the data model.

4. Click OK.

Finish up

In this task, you will complete the lab.

1. Verify that you have eight queries, correctly named as follows:

o Salesperson
o SalespersonRegion
o Product
o Reseller
o Region
o Sales
o Targets
o ColorFormats (which will not load to the data model)
2. To load the data model, on the File backstage view, select Close & Apply.

All load-enabled queries are now loaded to the data model.

3. In the Fields pane (located at the right), notice the seven tables loaded to the data model.

4. Save the Power BI Desktop file.

5. Leave Power BI Desktop open.


In the next lab, you will configure data model tables and relationships.

Lab 12

Data Modeling in Power BI Desktop

Objective:
The objective of this lab is to develop the data model. It will involve creating relationships between tables, and
then configuring table and column properties to improve the friendliness and usability of the data model. You
will also create hierarchies and create quick measures.

Activity Outcomes:
The activities provide hands-on practice with the following topics:
• Create model relationships
• Configure table and column properties
• Create hierarchies
• Create quick measures

Instructor Note:
As pre-lab activity, read Chapter xx from the textbook “Business Intelligence Guidebook: From Data
Integration to Analytics, Rick Sherman, Morgan Kaufmann Press, 2014”.

1) Useful Concepts

What is Data Modeling?


Data modeling is the process of defining the data structure, properties, and relationships within a data model. A
data model in Power BI is a logical representation of how data is structured and related within the tool. It is a
collection of tables and relationships between them that are used to create reports and visualizations.
A data model typically consists of one or more data sources, which can be anything from Excel spreadsheets to
cloud-based databases and one or more tables that represent the data in those sources.

2) Solved Lab Activities


Sr.No Allocated Time Level of Complexity CLO Mapping
1 20 Low CLO-6
2 20 Low CLO-6
3 20 Medium CLO-6
4 15 Medium CLO-6
5 20 Medium

Activity 1:
Create Model Relationships

Solution:
In this exercise, you will create model relationships.

Task 1: Create model relationships

In this task, you will create model relationships.

1. In Power BI Desktop, at the left, click the Model view icon.

2. If you do not see all seven tables, scroll horizontally to the right, and then drag and arrange the tables
more closely together so they can all be seen at the same time.

In Model view, it’s possible to view each table and relationships (connectors between tables). Presently,
there are no relationships because you disabled the data load relationship options.

3. To return to Report view, at the left, click the Report view icon.

4. To view all table fields, in the Fields pane, right-click an empty area, and then select Expand All.

5. To create a table visual, in the Fields pane, from inside the Product table, check the Category field.

From now on, the labs will use a shorthand notation to reference a field. It will look like this: Product | Category.

6. To add a column to the table, in the Fields pane, check the Sales | Sales field.

7. Notice that the table visual lists four product categories, and that the sales value is the same for each,
and the same for the total.

The issue is that the table is based on fields from different tables. The expectation is that each product
category displays the sales for that category. However, because there isn’t a model relationship between
these tables, the Sales table is not filtered. You will now add a relationship to propagate filters between
the tables.

8. On the Modeling ribbon tab, from inside the Relationships group, click Manage Relationships.

9. In the Manage Relationships window, notice that no relationships are yet defined.

10. To create a relationship, click New.

11. In the Create Relationship window, in the first dropdown list, select the Product table.

12. In the second dropdown list (beneath the Product table grid), select the Sales table.

13. Notice the ProductKey columns in each table have been selected.

The columns were automatically selected because they share the same name.

14. In the Cardinality dropdown list, notice that One To Many is selected.

The cardinality was automatically detected, because Power BI understands that the ProductKey column from the Product table contains unique values. One-to-many relationships are the most common cardinality, and all relationships you create in this lab will be this type.

In the Cross Filter Direction dropdown list, notice that Single is selected.

Single filter direction means that filters propagate from the "one side" to the "many side". In this case, it means filters applied to the Product table will propagate to the Sales table, but not in the other direction.

Notice that the Mark This Relationship Active checkbox is checked.

Active relationships will propagate filters. It's possible to mark a relationship as inactive so filters don't propagate. Inactive relationships can exist when there are multiple relationship paths between tables, in which case model calculations can use special functions to activate them.

15. Click OK.

16. In the Manage Relationships window, notice that the new relationship is listed, and then click Close.

17. In the report, notice that the table visual has updated to display different values for each product
category.

18. Filters applied to the Product table now propagate to the Sales table.

19. Switch to Model view, and then notice there is now a connector between the two tables.

20. In the diagram, notice that you can interpret the cardinality, which is represented by the 1 and * indicators.

Filter direction is represented by the arrow head. And, a solid line represents an active relationship; a
dashed line represents an inactive relationship.

21. Hover the cursor over the relationship to reveal the related columns.

There’s an easier way to create a relationship. In the model diagram, you can drag and drop columns
to create a new relationship.
22. To create a new relationship, from the Reseller table, drag the ResellerKey column onto the ResellerKey column of the Sales table.

Tip: Sometimes a column doesn't want to be dragged. If this situation arises, select a different column, then select the column you intend to drag, and try again.

23. Create the following two model relationships:

o Region | SalesTerritoryKey to Sales | SalesTerritoryKey
o Salesperson | EmployeeKey to Sales | EmployeeKey

In this lab, the SalespersonRegion and Targets tables will remain disconnected. There's a many-to-many relationship between salespeople and regions; you will work through this advanced scenario in a later activity.

24. In the diagram, arrange the tables so that the Sales table is in the center, with the related tables around it.

25. Save the Power BI Desktop file.

Activity 2:
Configure Tables

Solution:
In this exercise, you will configure each table by creating hierarchies, and hiding, formatting, and categorizing
columns.

Task 1: Configure the Product table

In this task, you will configure the Product table.

1. In Model view, in the Fields pane, if necessary, expand the Product table.

2. To create a hierarchy, in the Fields pane, right-click the Category column, and then select Create
Hierarchy.

3. In the Properties pane (to the left of the Fields pane), in the Name box, replace the text with Products.

4. To add the second level to the hierarchy, in the Hierarchy dropdown list, select Subcategory.

5. To add the third level to the hierarchy, in the Hierarchy dropdown list, select Product.

6. To complete the hierarchy design, click Apply Level Changes.

Tip: Don’t forget to click Apply Level Changes—it’s a common mistake to overlook this step.

7. In the Fields pane, notice the Products hierarchy.

8. To reveal the hierarchy levels, expand the Products hierarchy.

9. To organize columns into a display folder, in the Fields pane, first select the Background Color
Format column.

10. While pressing the Ctrl key, select the Font Color Format column.

11. In the Properties pane, in the Display Folder box, enter Formatting.

12. In the Fields pane, notice that the two columns are now inside a folder.

Display folders are a great way to declutter tables—especially those that contain lots of fields.

Task 2: Configure the Region table

In this task, you will configure the Region table.

1. In the Region table, create a hierarchy named Regions, with the following three levels:

o Group
o Country
o Region
2. Select the Country column (not the Country level).

3. In the Properties pane, expand the Advanced section, and then in the Data Category dropdown list,
select Country/Region.

Data categorization can provide hints to the report designer. In this case, categorizing the column as
country or region, provides more accurate information when rendering a map visualization.

Task 3: Configure the Reseller table

In this task, you will configure the Reseller table.

1. In the Reseller table, create a hierarchy named Resellers, with the following two levels:

o Business Type o
Reseller

179
2. Create a second hierarchy named Geography, with the following four levels:

o Country-Region
o State-Province
o City
o Reseller

3. Categorize the following three columns:

o Country-Region as Country/Region
o State-Province as State or Province
o City as City

Task 4: Configure the Sales table

In this task, you will configure the Sales table.

1. In the Sales table, select the Cost column.

2. In the Properties pane, in the Description box, enter: Based on standard cost

Descriptions can be applied to tables, columns, hierarchies, or measures. In the Fields pane, description text is revealed in a tooltip when a report author hovers their cursor over the field.

3. Select the Quantity column.


4. In the Properties pane, from inside the Formatting section, slide the Thousands Separator property
to On.

5. Select the Unit Price column.

6. In the Properties pane, from inside the Formatting section, slide the Decimal Places property to 2.

7. In the Advanced group (you may need to scroll down to locate it), in the Summarize By dropdown list,
select Average.

By default, numeric columns will summarize by summing values together. This default behavior is not
suitable for a column like Unit Price, which represents a rate. Setting the default summarization to
average will produce a useful and accurate result.

Task 5: Bulk update properties


In this task, you will update multiple columns in a single bulk update. You will use this approach to hide
columns, and format column values.

1. While pressing the Ctrl key, select the following 13 columns (spanning multiple tables):

o Product | ProductKey
o Region | SalesTerritoryKey
o Reseller | ResellerKey
o Sales | EmployeeKey
o Sales | ResellerKey
o Sales | SalesOrderNumber
o Sales | SalesTerritoryKey
o Salesperson | EmployeeID
o Salesperson | EmployeeKey
o Salesperson | UPN
o SalespersonRegion | EmployeeKey
o SalespersonRegion | SalesTerritoryKey
o Targets | EmployeeID

2. In the Properties pane, slide the Is Hidden property to On.

The columns were hidden because they are either used by relationships or will be used in row-level
security configuration or calculation logic.

You will define row-level security in the next lab using the UPN column. You will use the
SalesOrderNumber in a calculation in Lab 06A.

3. Multi-select the following columns:

o Product | Standard Cost
o Sales | Cost
o Sales | Sales

4. In the Properties pane, from inside the Formatting section, set the Decimal Places property to 0
(zero).

Activity 3:
Review the Model Interface

Solution:
In this exercise, you will switch to Report view, and review the model interface.

Task 1: Review the model interface

In this task, you will switch to Report view, and review the model interface.

1. Switch to Report view.

2. In the Fields pane, notice the following:

o Columns, hierarchies and their levels are fields, which can be used to configure report visuals
o Only fields relevant to report authoring are visible

o The SalespersonRegion table is not visible—because all of its fields are hidden
o Spatial fields in the Region and Reseller tables are adorned with a spatial icon
o Fields adorned with the sigma symbol (Ʃ) will summarize, by default
o A tooltip appears when hovering the cursor over the Sales | Cost field

3. Expand the Sales | OrderDate field, and then notice that it reveals a date hierarchy.

The Targets | TargetMonth field presents the same hierarchy. These hierarchies were not created by you; they were created automatically. There is a problem, however: the Adventure Works financial year commences on July 1 of each year, but the date hierarchy year commences on January 1 of each year.

You will now turn this automatic behavior off. In Lab 06A, you will use DAX to create a date table, and configure it to define the Adventure Works calendar.

4. To turn off auto date/time, click the File ribbon tab to open the backstage view.

5. At the left, select Options and Settings, and then select Options.

6. In the Options window, at the left, in the Current File group, select Data Load.

7. In the Time Intelligence section, uncheck Auto Date/Time.

8. Click OK.

9. In the Fields pane, notice that the date hierarchies are no longer available.

Activity 4:
Create Quick Measures

Solution:
In this exercise, you will create two quick measures.

Task 1: Create quick measures

In this task, you will create two quick measures to calculate profit and profit margin.

1. In the Fields pane, right-click the Sales table, and then select New Quick Measure.

2. In the Quick Measures window, in the Calculation dropdown list, from inside the Mathematical
Operations group, select Subtraction.

3. In the Fields pane, expand the Sales table.

4. Drag the Sales field into the Base Value box.

5. Drag the Cost field into the Value to Subtract box.

6. Click OK.

A quick measure creates the calculation for you. Quick measures are easy and fast to create for simple and common calculations. In the Fields pane, inside the Sales table, notice the new measure.

Measures are adorned with the calculator icon.

7. To rename the measure, right-click it, and then select Rename.

Tip: To rename a field, you can also double-click it, or select it and press F2.

8. Rename the measure to Profit, and then press Enter.

9. In the Sales table, add a second quick measure, based on the following requirements:

o Use the Division mathematical operation
o Set the Numerator to the Sales | Profit field
o Set the Denominator to the Sales | Sales field
o Rename the measure as Profit Margin
10. Ensure the Profit Margin measure is selected, and then on the Measure Tools contextual ribbon, set
the format to Percentage, with two decimal places.
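The two quick measures generate DAX similar to the following (the exact generated step names and formatting may differ in your environment):

```dax
Profit = SUM ( 'Sales'[Sales] ) - SUM ( 'Sales'[Cost] )

Profit Margin = DIVIDE ( [Profit], SUM ( 'Sales'[Sales] ) )
```

Note that the quick measure uses DIVIDE rather than the / operator; DIVIDE returns BLANK instead of an error when the denominator is zero.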

11. To test the two measures, first select the table visual on the report page.

12. In the Fields pane, check the two measures.

13. Click and drag the right guide to widen the table visual.

14. Verify that the measures produce reasonable results that are correctly formatted.

Activity 5:
Advanced Data Modeling in Power BI Desktop

Solution:
In this activity, you will create a many-to-many relationship between the Salesperson table and the Sales table.
You will also enforce row-level security to ensure that a salesperson can only analyze sales data for their
assigned region(s). You will also learn how to:

• Configure many-to-many relationships


• Enforce row-level security

Task 1: Create a many-to-many relationship

In this task, you will create a many-to-many relationship between the Salesperson table and the Sales table.

1. In Power BI Desktop, in Report view, in the Fields pane, check the follow two fields to create a table
visual:

o Salesperson | Salesperson
o Sales | Sales

The table displays sales made by each salesperson. However, there is another relationship between
salespeople and sales. Some salespeople belong to one, two, or possibly more sales regions. In addition,
sales regions can have multiple salespeople assigned to them.

From a performance management perspective, a salesperson's sales (based on their assigned territories) need to be analyzed and compared with sales targets. In this exercise, you will create relationships to support this analysis.

2. Notice that Michael Blythe has sold almost $9 million.

3. Switch to Model view.

4. Use the drag-and-drop technique to create the following two model relationships:

o Salesperson | EmployeeKey to SalespersonRegion | EmployeeKey
o Region | SalesTerritoryKey to SalespersonRegion | SalesTerritoryKey

The SalespersonRegion table can be considered to be a bridging table.

5. Switch to Report view, and then notice that the visual has not updated—the sales result for Michael
Blythe has not changed.

6. Switch back to Model view, and then follow the relationship filter directions (arrowhead) from the
Salesperson table.

Consider this: the Salesperson table filters the Sales table. It also filters the
SalespersonRegion table, but it does not continue to propagate to the Region table (the
arrowhead is pointing the wrong way).

7. To edit the relationship between the Region and SalespersonRegion tables, double-click the relationship.

8. In the Edit Relationship window, in the Cross Filter Direction dropdown list, select Both.

9. Check the Apply Security Filter in Both Directions checkbox.

This setting will ensure that bi-directional filtering is applied when row-level security is being enforced.
You will configure a security role in the next exercise.

10. Click OK.

11. Notice that the relationship has a double arrowhead.

12. Switch to Report view, and then notice that the sales values have not changed.

The issue now relates to the fact that there are two possible filter propagation paths between the
Salesperson and Sales tables. This ambiguity is internally resolved, based on a “least number of tables”
assessment. To be clear, you should not design models with this type of ambiguity—it will be addressed
in part in this lab, and by the next lab.

13. Switch to Model view.

14. To force filter propagation via the bridging table, double-click the relationship between the Salesperson
and Sales tables.

15. In the Edit Relationship window, uncheck the Make This Relationship Active checkbox.

16. Click OK.

The filter propagation is now forced to take the only active path.

17. In the diagram, notice that the inactive relationship is represented by a dashed line.

18. Switch to Report view, and then notice that the sales for Michael Blythe is now nearly $22 million.

19. Notice also, that the sales for each salesperson—if added—would exceed the total.

This observation demonstrates a consequence of many-to-many relationships: the double, triple, etc. counting of regional sales results. Consider Brian Welcker, the second salesperson listed. His sales amount equals the total sales amount. It's the correct result simply due to the fact that he's the Director of Sales; his sales are measured by the sales of all regions.

While the many-to-many relationship is now working, it's no longer possible to analyze sales made by a salesperson (the relationship is inactive). In the next lab, you'll introduce a calculated table that will represent salespeople for performance analysis (of their regions).

20. Switch to Model view, and then in the diagram, select the Salesperson table.

21. In the Properties pane, in the Name box, replace the text with Salesperson (Performance).

The renamed table now reflects its purpose: it is used to report and analyze the performance of
salespeople based on the sales of their assigned sales regions.

Task 2: Relate the Targets table

In this task, you will create a relationship to the Targets table

1. Create a relationship from the Salesperson (Performance) | EmployeeID column and the Targets |
EmployeeID column.

2. In Report view, add the Targets | Target field to the table visual.

Widen the table visual to reveal all data.

3. It’s now possible to visualize sales and targets—but take care, for two reasons. First, there is no filter
on a time period, and so targets also including future target values. Second, targets are not additive, and
so the total should not be displayed. They can either disabled by using a visual formatting property or
removed by using calculation logic. You’ll write a target measure that will return BLANK when more
than one salesperson is filtered.

Enforce row-level security


In this exercise, you will enforce row-level security to ensure a salesperson can only ever see sales made in their
assigned region(s).

Task 1: Enforce row-level security

In this task, you will enforce row-level security to ensure a salesperson can only ever see sales made in their
assigned region(s).

1. Switch to Data view.

2. In the Fields pane, select the Salesperson (Performance) table.

3. Review the data, noticing that Michael Blythe (EmployeeKey 281) has been assigned your Power BI
account (UPN column).
Recall that Michael Blythe is assigned to three sales regions: US Northeast, US Central, and US
Southeast.

4. Switch to Report view.

5. On the Modeling ribbon tab, from inside the Security group, click Manage Roles.

6. In the Manage Roles window, click Create.

7. In the box, replace the selected text with the name of the role: Salespeople, and then press Enter.

8. To assign a filter, for the Salesperson (Performance) table, click the ellipsis (…) character, and then
select Add Filter | [UPN].

9. In the Table Filter DAX Expression box, modify the expression by replacing “Value” with
USERNAME().

USERNAME() is a Data Analysis Expressions (DAX) function that retrieves the authenticated user. This
means that the Salesperson (Performance) table will filter by the User Principal Name (UPN) of the
user querying the model.
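After replacing the placeholder, the completed table filter expression should look similar to the following (a sketch; the [UPN] column reference comes from the filter added in step 8):

DAX

[UPN] = USERNAME()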

10. Click Save.

11. To test the security role, on the Modeling ribbon tab, from inside the Security group, click View As.

12. In the View as Roles window, check the Other User item, and then in the corresponding box, enter
your account name.
Tip: You can copy it from the MySettings.txt file.

13. Check the Salespeople role.

This configuration results in using the Salespeople role and impersonating the user with your account
name.

14. Click OK.

15. Notice the yellow banner above the report page, describing the test security context.

16. In the table visual, notice that only the salesperson Michael Blythe is listed.

17. To stop testing, at the right of the yellow banner, click Stop Viewing.

7) Graded Lab Tasks
Note: The instructor can design graded lab activities according to the level of difficulty and complexity
of the solved lab activities. The lab tasks assigned by the instructor should be evaluated in the same
lab.
Lab Task 1
Import a dataset from Excel into Power BI Desktop. Cleanse the data by removing duplicates and null values.
Create relationships between different tables in the dataset. Add calculated columns using Power BI's query
editor. Load the prepared data into Power BI Desktop's data model.

Lab 13
Using DAX in Power BI Desktop

Objective:
The objective of this lab is to learn how to create calculated tables, calculated columns, and simple measures
using Data Analysis Expressions (DAX).

Activity Outcomes:
The activities provide hands-on practice with the following topics:
• Create calculated tables
• Create calculated columns
• Create measures

Instructor Note:
As pre-lab activity, read Chapter xx from the textbook “Business Intelligence Guidebook: From Data
Integration to Analytics, Rick Sherman, Morgan Kaufmann Press, 2014”.

1) Useful Concepts

What is DAX?
DAX or Data Analysis Expressions drive all the calculations you can perform in Power BI. DAX formulas are
versatile, dynamic, and very powerful – they allow you to create new fields and even new tables in your model.
While DAX is most commonly associated with Power BI, you can also find DAX formulas in Power Pivot in
Excel and SQL Server Analysis Services (SSAS).
DAX formulas are made up of three core components, and this lab will cover each of these:
• Syntax – Proper DAX syntax is made up of a variety of elements, some of which are common to all
formulas.
• Functions – DAX functions are predefined formulas that take some parameters and perform a specific
calculation.
• Context – DAX uses context to determine which rows should be used to perform a calculation.

Where are DAX Formulas Used in Power BI?


There are three ways you can use DAX formulas in Power BI:
1. Calculated Tables - These calculations will add an additional table to the report based on a formula.
2. Calculated Columns - These calculations will add an additional column to a table based on a formula.
These columns are treated like any other field in the table.
3. Measures - These calculations will add a summary or aggregated measure to a table based on a formula.

How to Write a DAX Formula


DAX formulas are intuitive and easy to read. This makes it easy to understand the basics of DAX so you can
start writing your own formulas relatively quickly. Let’s go over the building blocks of proper DAX syntax.

1. The name of the measure or calculated column


2. The equal-to operator (“=”) indicates the start of the formula
3. A DAX function
4. Opening (and closing) parentheses (“()”)
5. Column and/or table references
6. Note that each subsequent parameter in a function is separated by a comma (“,”)
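As an illustration (a hypothetical measure, not part of this lab's data model), the following formula shows each of these elements: Total Sales is the name (1), the equals sign starts the formula (2), SUM is the DAX function (3), wrapped in parentheses (4), and Sales[Sales] is a table-and-column reference (5):

DAX

Total Sales = SUM(Sales[Sales])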

2) Solved Lab Activities


Sr.No   Allocated Time   Level of Complexity   CLO Mapping
1       20               Low                   CLO-6
2       20               Low                   CLO-6
3       20               Medium                CLO-6
4       20               Medium                CLO-6
5       20               Medium

Activity 1:
Create Calculated Tables

Solution:
In this exercise, you will create two calculated tables. The first will be the Salesperson table, to allow a direct
relationship between it and the Sales table. The second will be the Date table.

Task 1: Create the Salesperson table

In this task, you will create the Salesperson table (direct relationship to Sales).

1. In Power BI Desktop, in Report view, on the Modeling ribbon, from inside the Calculations
group, click New Table.

2. In the formula bar (which opens directly beneath the ribbon when creating or editing
calculations), type Salesperson =, press Shift+Enter, type 'Salesperson (Performance)', and
then press Enter.

For your convenience, all DAX definitions in this lab can be copied from the
D:\DA100\Lab06A\Assets\Snippets.txt file.

A calculated table is created by first entering the table name, followed by the equals symbol (=),
followed by a DAX formula that returns a table. The table name cannot already exist in the data model.

The formula bar supports entering a valid DAX formula. It includes features like autocomplete,
Intellisense and color-coding, enabling you to quickly and accurately enter the formula.

This table definition creates a copy of the Salesperson (Performance) table. It copies the data only;
properties like visibility and formatting are not copied.

Tip: You are encouraged to enter “white space” (i.e. carriage returns and tabs) to layout formulas in
an intuitive and easy-to-read format—especially when formulas are long and complex. To enter a
carriage return, press Shift+Enter. “White space” is optional.

3. In the Fields pane, notice that the table icon is a shade of blue (denoting a calculated table).

Calculated tables are defined by using a DAX formula which returns a table. It is important to
understand that calculated tables increase the size of the data model because they materialize and store
values. They are recomputed whenever formula dependencies are refreshed, as will be the case in this
data model when new (future) date values are loaded into tables.

Unlike Power Query-sourced tables, calculated tables cannot be used to load data from external data
sources. They can only transform data based on what has already been loaded into the data model.

4. Switch to Model view.

5. Notice that the Salesperson table is available (take care, it might be hidden from view—scroll
horizontally to locate it).
6. Create a relationship from the Salesperson | EmployeeKey column to the Sales | EmployeeKey
column.

7. Right-click the inactive relationship between the Salesperson (Performance) and Sales tables, and then
select Delete.

8. When prompted to confirm the deletion, click Delete.

9. In the Salesperson table, multi-select the following columns, and then hide them:

o EmployeeID
o EmployeeKey
o UPN

10. In the diagram, select the Salesperson table.

11. In the Properties pane, in the Description box, enter: Salesperson related to a sale

Recall that descriptions appear as tooltips in the Fields pane when the user hovers their cursor over a
table or field.

12. For the Salesperson (Performance) table, set the description to: Salesperson related to region(s)

The data model now provides two alternatives when analyzing salespeople. The Salesperson
table allows analyzing sales made by a salesperson, while the Salesperson (Performance)
table allows analyzing sales made in the sales region(s) assigned to the salesperson.

Task 2: Create the Date table

In this task, you will create the Date table.

1. Switch to Data view.

2. On the Home ribbon tab, from inside the Calculations group, click New Table.

3. In the formula bar, enter the following:

DAX

Date =
CALENDARAUTO(6)

The CALENDARAUTO() function returns a single-column table consisting of date values. The “auto”
behavior scans all data model date columns to determine the earliest and latest date values stored in
the data model. It then creates one row for each date within this range, extending the range in either
direction to ensure full years of data are stored.

This function can take a single optional argument which is the last month number of a year. When
omitted, the value is 12, meaning that December is the last month of the year. In this case 6 is entered,
meaning that June is the last month of the year.
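If you needed explicit control over the range instead of the automatic scan, an equivalent table could be sketched with the CALENDAR() function (the boundary dates below are hypothetical; choose dates that span your model's data in full fiscal years):

DAX

Date = CALENDAR(DATE(2017, 7, 1), DATE(2022, 6, 30))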

4. Notice the column of date values.

If the column does not appear, in the Fields pane, select a different table, and then select the Date table.

The dates shown are formatted using US regional settings (i.e. mm/dd/yyyy).

5. At the bottom-left corner, in the status bar, notice the table statistics, confirming that 1826 rows of data
have been generated, which represents five full years’ data.

Task 3: Create calculated columns

In this task, you will add additional columns to enable filtering and grouping by different time periods. You will
also create a calculated column to control the sort order of other columns.

1. On the Table Tools contextual ribbon, from inside the Calculations group, click New Column.

2. In the formula bar, type the following, and then press Enter:

DAX

Year =

"FY" & YEAR('Date'[Date]) + IF(MONTH('Date'[Date]) > 6, 1)

A calculated column is created by first entering the column name, followed by the equals symbol (=),
followed by a DAX formula that returns a single-value result. The column name cannot already exist in
the table.

The formula uses the date’s year value but adds one to the year value when the month is after June. This
is how fiscal years at Adventure Works are calculated.
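For example (illustrative dates only): for September 15, 2017, MONTH() returns 9, which is greater than 6, so 1 is added to the year and the column returns “FY2018”. For February 15, 2017, the IF() returns BLANK, which is treated as zero in the addition, so the column returns “FY2017”.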

3. Verify that the new column was added.

4. Use the snippets file definitions to create the following two calculated columns for the Date table:

o Quarter
o Month
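The snippet definitions themselves are not reproduced in this manual. One possible definition for each column, consistent with the July–June fiscal year (an assumption, not necessarily identical to the snippets file), is:

DAX

Quarter =
'Date'[Year] & " Q"
    & IF(
        MONTH('Date'[Date]) <= 3, 3,
        IF(
            MONTH('Date'[Date]) <= 6, 4,
            IF(MONTH('Date'[Date]) <= 9, 1, 2)
        )
    )

Month = FORMAT('Date'[Date], "yyyy MMM")

Here months July–September map to Q1 of the fiscal year, October–December to Q2, and so on.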

5. To validate the calculations, switch to Report view.

6. To create a new report page, at the bottom-left, click the plus icon.

7. To add a matrix visual to the new report page, in the Visualizations pane, select the matrix visual type.

Tip: You can hover the cursor over each icon to reveal a tooltip describing the visual type.

8. In the Fields pane, from inside the Date table, drag the Year field into the Rows well.

9. Drag the Month field into the Rows well, directly beneath the Year field.

10. At the top-right of the matrix visual, click the forked-double arrow icon (which will expand all years
down one level).

11. Notice that the years expand to months, and that the months are sorted alphabetically rather than
chronologically.

By default, text values sort alphabetically, numbers sort from smallest to largest, and dates sort from
earliest to latest.

12. To customize the Month field sort order, switch to Data view.

13. Add the MonthKey column to the Date table.

DAX

MonthKey =

(YEAR('Date'[Date]) * 100) + MONTH('Date'[Date])

This formula computes a numeric value for each year/month combination.

14. In Data view, verify that the new column contains numeric values (e.g. 201707 for July 2017, etc.).

15. Switch back to Report view.

16. In the Fields pane, ensure that the Month field is selected (when selected, it will have a dark gray
background).

17. On the Column Tools contextual ribbon, from inside the Sort group, click Sort by Column, and then
select MonthKey.

18. In the matrix visual, notice that the months are now chronologically sorted.

Task 4: Complete the Date table

In this task, you will complete the design of the Date table by hiding a column and creating a hierarchy. You
will then create relationships to the Sales and Targets tables.

1. Switch to Model view.

2. In the Date table, hide the MonthKey column.

3. In the Date table, create a hierarchy named Fiscal, with the following three levels:

o Year
o Quarter
o Month

4. Create the following two model relationships:

o Date | Date to Sales | OrderDate
o Date | Date to Targets | TargetMonth

5. Hide the following two columns:

o Sales | OrderDate
o Targets | TargetMonth

Task 5: Mark the Date table

In this task, you will mark the Date table as a date table.

1. Switch to Report view.

2. In the Fields pane, select the Date table (not field).

3. On the Table Tools contextual ribbon, from inside the Calendars group, click Mark as Date Table,
and then select Mark as Date Table.

4. In the Mark as Date Table window, in the Date Column dropdown list, select Date.

5. Click OK.

6. Save the Power BI Desktop file.

Power BI Desktop now understands that this table defines date (time). This is important when relying
on time intelligence calculations.

Note that this design approach for a date table is suitable when you don’t have a date table in your data
source. If you have access to a data warehouse, it would be appropriate to load date data from its date
dimension table rather than “redefining” date logic in your data model.

Activity 2:
Create Measures

Solution:
In this exercise, you will create and format several measures.

Task 1: Create simple measures

In this task, you will create simple measures. Simple measures aggregate a single column or table.

1. In Report view, on Page 2, in the Fields pane, drag the Sales | Unit Price field into the matrix visual.

Recall that in the previous lab, you set the Unit Price column to summarize by Average. The result you see
in the matrix visual is the monthly average unit price.

2. In the visual fields pane (located beneath the Visualizations pane), in the Values well, notice that Unit
Price is listed.

3. Click the down-arrow for Unit Price, and then notice the available menu options.

Visible numeric columns allow report authors to decide at report design time how a column will
summarize (or not). This can result in inappropriate reporting. Some data modelers do not like leaving
things to chance, however, and choose to hide these columns and instead expose aggregation logic
defined by measures. This is the approach you will now take in this lab.

4. To create a measure, in the Fields pane, right-click the Sales table, and then select New Measure.

5. In the formula bar, add the following measure definition:
DAX
Avg Price = AVERAGE(Sales[Unit Price])

6. Add the Avg Price measure to the matrix visual.


7. Notice that it produces the same result as the Unit Price column (but with different formatting).
8. In the Values well, open the context menu for the Avg Price field, and notice that it is not possible to
change the aggregation technique.

9. Use the snippets file definitions to create the following five measures for the Sales table:

o Median Price
o Min Price
o Max Price
o Orders
o Order Lines

The DISTINCTCOUNT() function used in the Orders measure will count orders only once (ignoring
duplicates). The COUNTROWS() function used in the Order Lines measure operates over a table. In this
case, the number of orders is calculated by counting the distinct SalesOrderNumber column values,
while the number of order lines is simply the number of table rows (each row is a line of an order).
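Based on that description, the two counting measures in the snippets file likely resemble the following sketches (the Median/Min/Max price measures follow the same single-column aggregation pattern as Avg Price):

DAX

Orders = DISTINCTCOUNT(Sales[SalesOrderNumber])

Order Lines = COUNTROWS(Sales)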

10. Switch to Model view, and then multi-select the four price measures: Avg Price, Max Price, Median
Price, and Min Price.

11. For the multi-selection of measures, configure the following requirements:

o Set the format to two decimal places
o Assign to a display folder named Pricing

12. Hide the Unit Price column.

The Unit Price column is now not available to report authors. They must use the measure you’ve added
to the model. This design approach ensures that report authors won’t inappropriately aggregate prices,
for example, by summing them.
13. Multi-select the Orders and Order Lines measures, and configure the following requirements:

o Set the format to use the thousands separator
o Assign to a display folder named Counts

14. In Report view, in the Values well of the matrix visual, for the Unit Price field, click X to remove it.

15. Increase the size of the matrix visual to fill the page width and height.

16. Add the following five new measures to the matrix visual:

o Median Price
o Min Price
o Max Price
o Orders
o Order Lines

17. Verify that the results look sensible and are correctly formatted.

Task 2: Create additional measures

In this task, you will create additional measures that use more complex expressions.

1. In Report view, select Page 1.

2. Review the table visual, noticing the total for the Target column.

Summing the target values together doesn’t make sense because salespeople targets are set for
each salesperson based on their sales region assignment(s). A target value should only be shown
when a single salesperson is filtered. You will implement a measure now to do just that.

3. In the table visual, remove the Target field.


4. Rename the Targets | Target column as Targets | TargetAmount.
Tip: There are several ways to rename the column in Report view: In the Fields pane, you can
right-click the column, and then select Rename—or, double-click the column, or press F2.

You’re about to create a measure named Target. It’s not possible to have a column and measure
in the same table, with the same name.

5. Create the following measure on the Targets table:

DAX
Target =

IF(
    HASONEVALUE('Salesperson (Performance)'[Salesperson]),
    SUM(Targets[TargetAmount])
)

The HASONEVALUE() function tests whether a single value in the Salesperson column is filtered.
When true, the expression returns the sum of target amounts (for just that salesperson). When false,
BLANK is returned.

6. Format the Target measure for zero decimal places.

Tip: You can use the Measure Tools contextual ribbon.

7. Hide the TargetAmount column.

8. Add the Target measure to the table visual.

9. Notice that the Target column total is now BLANK.

10. Use the snippets file definitions to create the following two measures for the Targets table:

o Variance
o Variance Margin
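The snippet definitions are not shown in the text; presumably they reuse the HASONEVALUE() guard from the Target measure so that variance values are only returned for a single filtered salesperson. A sketch (an assumption, not necessarily identical to the snippets file):

DAX

Variance =
IF(
    HASONEVALUE('Salesperson (Performance)'[Salesperson]),
    SUM(Targets[TargetAmount]) - SUM(Sales[Sales])
)

Variance Margin =
DIVIDE([Variance], [Target])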

11. Format the Variance measure for zero decimal places.

12. Format the Variance Margin measure as percentage with two decimal places.

13. Add the Variance and Variance Margin measures to the table visual.

14. Widen the table visual so all values are displayed.

While it appears all salespeople are not meeting target, remember that the measures aren't yet
filtered by a specific time period. You'll produce sales performance reports that filter by a
user-selected time period in Lab 07A.

15. At the top-right corner of the Fields pane, collapse and then expand the pane.

Collapsing and re-opening the pane resets its content.

16. Notice that the Targets table now appears at the top of the list.

Tables that comprise only visible measures are automatically listed at the top of the list.

Activity 3:
Create measures with DAX expressions involving filter context manipulation.
Solution:
• Use the CALCULATE() function to manipulate filter context
• Use Time Intelligence functions

Work with Filter Context
In this exercise, you will create measures with DAX expressions involving filter context manipulation.

Task 1: Create a matrix visual

In this task, you will create a matrix visual to support testing your new measures.

1. In Power BI Desktop, in Report view, create a new report page.

2. On Page 3, add a matrix visual.

3. Resize the matrix visual to fill the entire page.


4. To configure the matrix visual fields, from the Fields pane, drag the Region | Regions hierarchy,
and drop it inside the visual.

5. Add also the Sales | Sales field.

6. To expand the entire hierarchy, at the top-right of the matrix visual, click the forked-double arrow
icon twice.

Recall that the Regions hierarchy has the levels Group, Country, and Region.

7. To format the visual, beneath the Visualizations pane, select the Format pane.

8. In the Search box, enter Stepped.

9. Set the Stepped Layout property to Off.

10. Verify that the matrix visual has four column headers.

At Adventure Works, the sales regions are organized into groups, countries, and regions. All
countries—except the United States—have just one region, which is named after the country. As
the United States is such a large sales territory, it is divided into five regions.

You’ll create several measures in this exercise, and then test them by adding them to the matrix
visual.

Task 2: Manipulate filter context

In this task, you will create several measures with DAX expressions that use the CALCULATE() function
to manipulate filter context.

1. Add a measure to the Sales table, based on the following expression:

DAX

Sales All Region =


CALCULATE(SUM(Sales[Sales]), REMOVEFILTERS(Region))

The CALCULATE() function is a powerful function used to manipulate the filter context. The first
argument takes an expression or a measure (a measure is just a named expression). Subsequent
arguments allow modifying the filter context.

The REMOVEFILTERS() function removes active filters. It can take either no arguments, or a
table, a column, or multiple columns as its argument.

In this formula, the measure evaluates the sum of the Sales column in a modified filter context,
which removes any filters applied to the Region table.

2. Add the Sales All Region measure to the matrix visual.

3. Notice that the Sales All Region measure computes the total of all region sales for each region,
country (subtotal) and group (subtotal).

This measure is yet to deliver a useful result. When the sales for a group, country, or region is
divided by this value it produces a useful ratio known as “percent of grand total”.

4. In the Fields pane, ensure that the Sales All Region measure is selected, and then in the formula
bar, replace the measure name and formula with the following formula:

Tip: To replace the existing formula, first copy the snippet. Then, click inside the formula bar and
press Ctrl+A to select all text. Then, press Ctrl+V to paste the snippet to overwrite the selected
text. Then press Enter.

DAX

Sales % All Region =


DIVIDE(
SUM(Sales[Sales]),
CALCULATE(
SUM(Sales[Sales]),
REMOVEFILTERS(Region)

)
)

The measure has been renamed to accurately reflect the updated formula. The DIVIDE()
function divides the Sales measure (not modified by filter context) by the Sales measure in a
modified context which removes any filters applied to the Region table.

5. In the matrix visual, notice that the measure has been renamed and that different values now
appear for each group, country, and region.

6. Format the Sales % All Region measure as a percentage with two decimal places.

7. In the matrix visual, review the Sales % All Region measure values.

8. Add another measure to the Sales table, based on the following expression, and format as a
percentage:

DAX

Sales % Country =
DIVIDE(
SUM(Sales[Sales]),
CALCULATE(
SUM(Sales[Sales]),
REMOVEFILTERS(Region[Region])
)
)

9. Notice that the Sales % Country measure formula differs slightly from the Sales % All Region
measure formula.

The difference is that the denominator modifies the filter context by removing filters on the Region
column of the Region table, not all columns of the Region table. It means that any filters applied
to the group or country columns are preserved. It will achieve a result which represents the sales
as a percentage of country.

10. Add the Sales % Country measure to the matrix visual.

11. Notice that only the United States’ regions produce a value which is not 100%.

Recall that only the United States has multiple regions. All other countries have a single region
which explains why they are all 100%.

12. To improve the readability of this measure in the visual, overwrite the Sales % Country measure with
this improved formula.

DAX

Sales % Country =
IF(
    ISINSCOPE(Region[Region]),
    DIVIDE(
        SUM(Sales[Sales]),
        CALCULATE(
            SUM(Sales[Sales]),
            REMOVEFILTERS(Region[Region])
        )
    )
)

Embedded within the IF() function, the ISINSCOPE() function is used to test whether the Region
column is the hierarchy level in scope. When true, the DIVIDE() function is evaluated. The
absence of a false part means that BLANK is returned when the Region column is not in scope.

13. Notice that the Sales % Country measure now only returns a value when a region is in scope.

14. Add another measure to the Sales table, based on the following expression, and format as a
percentage:

DAX

Sales % Group =
DIVIDE(
SUM(Sales[Sales]),
CALCULATE(
SUM(Sales[Sales]),
REMOVEFILTERS(
Region[Region],
Region[Country]
)
)
)

To achieve sales as a percentage of group, two filters can be applied to effectively
remove the filters on two columns.

15. Add the Sales % Group measure to the matrix visual.

16. To improve the readability of this measure in the visual, overwrite the Sales % Group
measure with this improved formula.

DAX

Sales % Group =
IF(
ISINSCOPE(Region[Region])
|| ISINSCOPE(Region[Country]),
DIVIDE(
SUM(Sales[Sales]),
CALCULATE(
SUM(Sales[Sales]),
REMOVEFILTERS(
Region[Region],
Region[Country]
)
)
)
)

17. Notice that the Sales % Group measure now only returns a value when a region or country is in
scope.

18. In Model view, place the three new measures into a display folder named Ratios.

19. Save the Power BI Desktop file.

The measures added to the Sales table have modified filter context to achieve hierarchical
navigation. Notice that the pattern to achieve the calculation of a subtotal requires removing some
columns from the filter context, and to arrive at a grand total, all columns must be removed.

Activity 4:
Work with Time Intelligence

Solution
In this exercise, you will create a sales year-to-date (YTD) measure and sales year-over-year (YoY) growth
measure.

Task 1: Create a YTD measure

In this task, you will create a sales YTD measure.

1. In Report view, on Page 2, notice the matrix visual which displays various measures with years
and months grouped on the rows.

2. Add a measure to the Sales table, based on the following expression, and formatted to zero decimal
places:

DAX

Sales YTD =
TOTALYTD(SUM(Sales[Sales]), 'Date'[Date], "6-30")

The TOTALYTD() function evaluates an expression—in this case the sum of the Sales column—
over a given date column. The date column must belong to a date table marked as a date table, as
you did in Lab 06A. The function can also take a third optional argument representing the last date
of a year. The absence of this date means that December 31 is the last date of the year. For
Adventure Works, June is the last month of their year, and so “6-30” is used.

3. Add the Sales field and the Sales YTD measure to the matrix visual.

4. Notice the accumulation of sales values within the year.

The TOTALYTD() function performs filter manipulation, specifically time filter manipulation. For
example, to compute YTD sales for September 2017 (the third month of the fiscal year), all filters
on the Date table are removed and replaced with a new filter of dates commencing at the beginning
of the year (July 1, 2017) and extending through to the last date of the in-context date period
(September 30, 2017).

Note that many Time Intelligence functions are available in DAX to support common time filter
manipulations.
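TOTALYTD() is shorthand for a CALCULATE() pattern; the same measure could equivalently be written with the DATESYTD() function, which makes the time filter manipulation explicit (a sketch, equivalent to the measure created above):

DAX

Sales YTD =
CALCULATE(
    SUM(Sales[Sales]),
    DATESYTD('Date'[Date], "6-30")
)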

Task 2: Create a YoY growth measure

In this task, you will create a sales YoY growth measure.

1. Add an additional measure to the Sales table, based on the following expression:

DAX

Sales YoY Growth =


VAR SalesPriorYear =
CALCULATE(
SUM(Sales[Sales]),
PARALLELPERIOD(
'Date'[Date],
-12,

MONTH
)
)
RETURN
SalesPriorYear

The Sales YoY Growth measure formula declares a variable. Variables can be useful for
simplifying the formula logic, and more efficient when an expression needs to be evaluated multiple
times within the formula (which will be the case for the YoY growth logic). Variables are declared
by a unique name, and the measure expression must then be output after the RETURN keyword.

The SalesPriorYear variable is assigned an expression which calculates the sum of the Sales
column in a modified context that uses the PARALLELPERIOD() function to shift 12 months back
from each date in filter context.
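A related function, DATEADD(), could achieve a similar shift: it moves each date in filter context back by the given interval, whereas PARALLELPERIOD() always returns full months. A sketch of the variable's expression rewritten with DATEADD() (a hypothetical alternative, not part of the lab's snippets):

DAX

VAR SalesPriorYear =
    CALCULATE(
        SUM(Sales[Sales]),
        DATEADD('Date'[Date], -1, YEAR)
    )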

2. Add the Sales YoY Growth measure to the matrix visual.

3. Notice that the new measure returns blank for the first 12 months (there were no sales recorded
before fiscal year 2017).

4. Notice that the Sales YoY Growth measure value for 2017 Jul is the Sales value for 2016 Jul.

Now that the “difficult part” of the formula has been tested, you can overwrite the measure with
the final formula which computes the growth result.

5. To complete the measure, overwrite the Sales YoY Growth measure with this formula, formatting
it as a percentage with two decimal places:

DAX

Sales YoY Growth =


VAR SalesPriorYear =
CALCULATE(
SUM(Sales[Sales]),
PARALLELPERIOD(
'Date'[Date],
-12,
MONTH
)
)
RETURN
DIVIDE(
(SUM(Sales[Sales]) - SalesPriorYear),
SalesPriorYear
)

6. In the formula, in the RETURN clause, notice that the variable is referenced twice.

7. Verify that the YoY growth for 2018 Jul is 392.83%.

This means that July 2018 sales ($2,411,559) represents a nearly 400% (almost 4x) improvement
over the sales achieved for the prior year ($489,328).

8. In Model view, place the two new measures into a display folder named Time Intelligence.

9. Save the Power BI Desktop file.

DAX includes many Time Intelligence functions to make it easy to implement time filter
manipulations for common business scenarios.

This exercise completes the data model development. In the next exercise, you will publish the
Power BI Desktop file to your workspace, ready for creating a report in the next lab.

Exercise 3: Publish the Power BI Desktop File


In this exercise, you will publish the Power BI Desktop file to Power BI.

Task 1: Publish the file

In this task, you will publish the Power BI Desktop file to Power BI.

1. Save the Power BI Desktop file.

If you’re not confident you completed this lab successfully, you should publish the Power BI
Desktop file found in the D:\DA100\Lab06B\Solution folder. In this case, close your current Power
BI Desktop file, and then open the solution file. First, perform a data refresh (using the Refresh
command on the ribbon), and then continue with the instructions in this task.

2. To publish the file, on the Home ribbon tab, from inside the Share group, click Publish.

3. In the Publish to Power BI window, select your Sales Analysis workspace.

It’s important that you publish it to the workspace you created in Lab 01A, and not “My
workspace”.

4. Click Select.

5. When the file has been successfully published, click Got It.

6. Close Power BI Desktop.

7. In Microsoft Edge, in the Power BI service, in the Navigation pane (located at the left), review the contents
of your Sales Analysis workspace.

The publication has added a report and a dataset. If you don't see them, press F5 to reload the
browser, and then expand the workspace again.

The data model has been published to become a dataset. The report—used to test your model
calculations—has been added as a report. This report is not required, so you will now delete it.

8. Hover the cursor over the Sales Analysis report, click the vertical ellipsis (…), and then select
Remove.

9. When prompted to confirm the deletion, click Delete.

In the next lab, you will create a report based on the published dataset.

3) Graded Lab Tasks


Note: The instructor can design graded lab activities according to the level of difficulty and
complexity of the solved lab activities. The lab tasks assigned by the instructor should be
evaluated in the same lab.
Lab Task 1
Download any dataset from the internet and write DAX functions to perform different operations on the data.

Lab 14
Designing a Report in Power BI Desktop

Objective:
The objective of this lab is to create a three-page report named Sales Report. You will then publish it to
Power BI, whereupon you will open and interact with the report.

Activity Outcomes:
The activities provide hands-on practice with the following topics:
• Use Power BI Desktop to create a live connection
• Design a report
• Configure visual fields and format properties

Instructor Note:
As pre-lab activity, read Chapter xx from the textbook “”.

1) Useful Concepts

What are Power BI Reports?

Power BI reports are comprehensive and detailed pages that provide in-depth analysis and insights. They
offer more advanced functionalities compared to dashboards.

Reports enable users to dive deep into data, perform ad-hoc analysis, and explore multiple dimensions.
They provide a comprehensive view of data, allowing users to answer complex business questions. This
is made possible by offering interactive features such as drill-through, filtering, and highlighting. Users can
explore data further by interacting with the visualizations, uncovering deeper insights.

Since so many visuals and other elements are incorporated into reports, it is best to split them
across multiple pages or tabs, each containing different visualizations, tables, and interactive elements.
Users can navigate between pages to explore specific aspects of the data, by using buttons and bookmarks.

Another technique to declutter your reports is progressive disclosure: showing the bare
necessities as a starting point, while letting users uncover more detail by clicking buttons that
reveal additional visuals. This gives the user a more app-like experience.

Reports also support advanced data modeling techniques. Users can create custom measures and
calculated columns, and use DAX (Data Analysis Expressions) to perform complex calculations,
allowing for performant and efficient visualizations.
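For example, a custom measure is a single DAX expression evaluated in the filter context of each visual. A minimal sketch, with illustrative table and column names:

```dax
-- Ratio measure; DIVIDE returns BLANK on division by zero instead of raising an error
Profit Margin =
DIVIDE ( SUM ( Sales[Profit] ), SUM ( Sales[Sales] ) )
```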

2) Solved Lab Activities


Sr.No Allocated Time Level of Complexity CLO Mapping
1 10 Low CLO-6
2 10 Low CLO-6
3 20 Medium CLO-6
4 10 Low CLO-6

Activity 1:
Create a Report

Solution:

In this exercise, you will create a three-page report named Sales Report.

Task 1: Create a new file

In this task, you will create a live connection to the Sales Analysis dataset.

1. To open Power BI Desktop, on the taskbar, click the Microsoft Power BI Desktop shortcut.

2. At the top-right corner of the welcome screen, click X.

3. Click the File ribbon tab to open the backstage view, and then select Save.

4. In the Save As window, navigate to the D:\DA100\MySolution folder.

5. In the File Name box, enter Sales Report.

6. Click Save.

Task 2: Create a live connection

In this task, you will create a live connection to the Sales Analysis dataset.

1. To create a live connection, on the Home ribbon tab, from inside the Data group, click Get Data,
down-arrow, and then select Power BI Datasets.

2. In the Select a Dataset to Create a Report window, select the Sales Analysis dataset.

3. Click Create.

4. At the bottom-right corner, in the status bar, notice that the live connection has been established.

5. In the Fields pane, notice that the data model tables are listed.

Power BI Desktop can no longer be used to develop the data model; in live connection mode, it’s
only a report authoring tool. It is possible, however, to create measures, but they are only
available within the report. You won’t add any report-scoped measures in this lab.
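A report-scoped measure is written in ordinary DAX but stored in the report file rather than the dataset. A hypothetical example (not created in this lab; it assumes the dataset’s Orders measure from the Counts folder):

```dax
-- Average revenue per order; [Orders] is the count measure published with the dataset
Sales per Order =
DIVIDE ( SUM ( Sales[Sales] ), [Orders] )
```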

6. Save the Power BI Desktop file.

Recall that you added the Salespeople role to the model in Lab 05A. Because you’re the owner of
the Power BI dataset, the roles are not enforced. This explains why, in this lab, you can see all
data.

Task 3: Design page 1

In this task, you will design the first report page. When you’ve completed the design, the page will look
like the following:

1. To rename the page, at the bottom-left, right-click Page 1, and then select Rename.

Tip: You can also double-click the page name.

2. Rename the page as Overview, and then press Enter.

3. To add an image, on the Insert ribbon tab, from inside the Elements group, click Image.

4. In the Open window, navigate to the D:\DA100\Data folder.

5. Select the AdventureWorksLogo.jpg file, and then click Open.

6. Drag the image to reposition it at the top-left corner, and also drag the guide markers to resize it.

7. To add a slicer, first de-select the image by clicking an empty area of the report page.

8. In the Fields pane, select the Date | Year field (not the Year level of the hierarchy).
9. Notice that a table of year values has been added to the report page.

10. To convert the visual from a table to a slicer, in the Visualizations pane, select the Slicer.

11. To convert the slicer from a list to a dropdown, at the top-right of the slicer, click the down-arrow,
and then select Dropdown.

12. Resize and reposition the slicer so it sits beneath the image, and so it is the same width as the image.

13. In the Year slicer, select FY2020, and then collapse the dropdown list.

The report page is now filtered by year FY2020.

14. De-select the slicer by clicking an empty area of the report page.

15. Create a second slicer, based on the Region | Region field (not the Region level of the hierarchy).

16. Leave the slicer as a list, and then resize and reposition the slicer beneath the Year slicer.

17. To format the slicer, beneath the Visualizations pane, open the Format pane.

18. Expand the Selection Controls group.

19. Set the Show “Select All” Option to On.

20. In the Region slicer, notice that the first item is now Select All.

When selected, this item either selects all, or de-selects all items. It makes it easier for report users
to set the right filters.

21. De-select the slicer by clicking an empty area of the report page.

22. To add a chart to the page, in the Visualizations pane, click the Line and Stacked Column Chart
visual type.

23. Resize and reposition the visual so it sits to the right of the logo, and so it fills the width of the
report page.

24. Drag the following fields into the visual:

o Date | Month
o Sales | Sales

25. In the visual fields pane (not the Fields pane—the visual fields pane is located beneath the
Visualizations pane), notice that the fields are assigned to the Shared Axis and Column Values
wells.

When you drag fields into a visual, they are added to default wells. For precision, you can drag
fields directly into the wells, as you will do now.

26. From the Fields pane, drag the Sales | Profit Margin field into the Line Values well.

27. Notice that the visual has 11 months only.

The last month of the year, 2020 June, does not have any sales (yet). By default, the visual has
eliminated months with BLANK sales. You will now configure the visual to show all months.

28. In the visual fields pane, in the Shared Axis well, for the Month field, click the down-arrow, and
then select Show Items With No Data.

29. Notice that the month 2020 June now appears.

30. De-select the chart by clicking an empty area of the report page.

31. To add a chart to the page, in the Visualizations pane, click the Map visual type.

32. Resize and reposition the visual so it sits beneath the column/line chart, and so it fills half the width
of the report page.

33. Add the following fields to the visual wells:

o Location: Region | Country
o Legend: Product | Category
o Size: Sales | Sales

34. De-select the chart by clicking an empty area of the report page.

35. To add a chart to the page, in the Visualizations pane, click the Stacked Bar Chart visual type.

36. Resize and reposition the visual so it fills the remaining report page space.

37. Add the following fields to the visual wells:

o Axis: Product | Category
o Value: Sales | Quantity

38. To format the visual, open the Format pane.

39. Expand the Data Colors group, and then set the Default Color property to a suitable color (in
contrast to the column/line chart).

40. Set the Data Labels property to On.

41. Save the Power BI Desktop file.

The design of the first page is now complete.

Task 4: Design page 2

In this task, you will design the second report page. When you’ve completed the design, the page will look
like the following:

When detailed instructions have already been provided in the labs, the lab steps will now provide more
concise instructions. If you need the detailed instructions, you can refer back to other tasks.

1. To create a new page, at the bottom-left, click the plus icon.

2. Rename the page to Profit.

3. Add a slicer based on the Region | Region field.

4. Use the Format pane to enable the “Select All” option (in the Selection Controls group).

5. Resize and reposition the slicer so it sits at the left side of the report page, and so it is about half the
page height.

6. Add a matrix visual, and resize and reposition it so it fills the remaining space of the report page.

7. Add the Date | Fiscal hierarchy to the matrix Rows well.

8. Add the following five Sales table fields to the Values well:

o Orders (from the Counts folder)
o Sales
o Cost
o Profit
o Profit Margin

9. In the Filters pane (located at the left of the Visualizations pane), notice the Filter On This Page
well (you may need to scroll down).

10. From the Fields pane, drag the Product | Category field into the Filter On This Page well.

11. Inside the filter card, at the top-right, click the arrow to collapse the card.

Fields added to the Filters pane can achieve the same result as a slicer. One difference is they
don’t take up space on the report page. Another difference is that they can be configured for more
advanced filtering requirements.

12. Add each of the following Product table fields to the Filter On This Page well, collapsing each,
directly beneath the Category card:

o Subcategory
o Product
o Color

13. To collapse the Filters pane, at the top-right of the pane, click the arrow.

14. Save the Power BI Desktop file.

The design of the second page is now complete.

Task 5: Design page 3

In this task, you will design the third and final report page. When you’ve completed the design, the page
will look like the following:

1. Create a new page, and then rename it as My Performance.

Recall that row-level security was configured to ensure users only ever see data for their sales
regions and targets. When this report is distributed to salespeople, they will only ever see their
sales performance results.

2. To simulate the row-level security filters during report design and testing, add the Salesperson
(Performance) | Salesperson field to the Filters pane, inside the Filters On This Page well.

3. In the filter card, scroll down the list of salespeople, and then check Michael Blythe.
You will be instructed to delete this filter before you distribute the report in an app in Lab 12A.

4. Add a dropdown slicer based on the Date | Year field, and then resize and reposition it so it sits at
the top-left corner of the page.

5. In the slicer, select FY2019.

6. Add a Multi-row Card visual, and then resize and reposition it so it sits to the right of the slicer
and fills the remaining width of the page.

7. Add the following four fields to the visual:

o Sales | Sales
o Targets | Target
o Targets | Variance
o Targets | Variance Margin

8. Format the visual:

o In the Data Labels group, increase the Text Size property to 28pt
o In the Background group, set the Color to a light gray color

9. Add a Clustered Bar Chart visual, and then resize and reposition it so it sits beneath the multi-
row card visual and fills the remaining height of the page, and half the width of the multi-row card
visual.

10. Add the following fields to the visual wells:

o Axis: Date | Month
o Value: Sales | Sales and Targets | Target

11. To create a copy of the visual, press Ctrl+C, and then press Ctrl+V.

12. Position the copied visual to the right of the original visual.

13. To modify the visualization type, in the Visualizations pane, select Clustered Column Chart.

It’s now possible to see the same data expressed by two different visualization types. This isn’t a
good use of the page layout, but you will improve it in Lab 09A by superimposing the visuals. By
adding buttons to the page, you will allow the report user to determine which of the two visuals is
visible.

The design of the third and final page is now complete.

Task 6: Publish the report

In this task, you will publish the report.

1. Select the Overview page.

2. Save the Power BI Desktop file.

3. On the Home ribbon tab, from inside the Share group, click Publish.

4. Publish the report to your Sales Analysis workspace.

5. Leave Power BI Desktop open.

In the next exercise, you will explore the report in the Power BI service.

Activity 2:
Explore the Report

Solution:
In this exercise, you will explore the Sales Report in the Power BI service.

Task 1: Explore the report

In this task, you will explore the Sales Report in the Power BI service.

1. In Edge, in the Power BI service, in the Navigation pane, review the contents of your Sales
Analysis workspace, and then click the Sales Report report.

The report publication has added a report to your workspace. If you don’t see it, press F5 to
reload the browser, and then expand the workspace again.
2. In the Regions slicer, while pressing the Ctrl key, select multiple regions.

3. In the column/line chart, select any month column to cross filter the page.

4. While pressing the Ctrl key, select an additional month.

By default, cross filtering filters the other visuals on the page.

5. Notice that the bar chart is filtered and highlighted, with the bold portion of the bars representing
the filtered months.

6. Hover the cursor over the visual, and then at the top-right, click the filter icon.

The filter icon allows you to understand all filters that are applied to the visual, including slicers
and cross filters from other visuals.

7. Hover the cursor over a bar, and then notice the tooltip information.

8. To undo the cross filter, in the column/line chart, click an empty area of the visual.

9. Hover the cursor over the map visual, and then at the top-right, click the In Focus icon.

In focus mode zooms the visual to full page size.

10. Hover the cursor over different segments of the pie charts to reveal tooltips.
11. To return to the report page, at the top-left, click Back to Report.

12. Hover the cursor over the map visual again, and then click the ellipsis (…), and notice the menu
options.

13. Try out each of the options.

14. At the left, in the Pages pane, select the Profit page.

15. Notice that the Region slicer has a different selection to the Region slicer on the Overview page.
The slicers are not synchronized. In the next lab, you will modify the report design to ensure they
sync between pages.

16. In the Filters pane (located at the right), expand a filter card, and apply some filters. The Filters
pane allows you to define more filters than could possibly fit on a page as slicers.

17. In the matrix visual, use the plus (+) button to expand into the Fiscal hierarchy.
18. Select the My Performance page.

19. At the top-right on the menu bar, click View, and then select Full Screen.

20. Interact with the page by modifying the slicer, and cross filtering the page.

21. At the bottom-left, notice the commands to change page, navigate backwards or forwards between
pages, or to exit full screen mode.

22. Exit full screen mode.

23. To return to the workspace, in the breadcrumb trail, click your workspace name.

24. Leave the Edge browser window open.

Activity 3:
Configure Sync Slicers

Solution:
In this exercise, you will sync the report page slicers.

Task 1: Sync slicers

In this task, you will sync the Year and Region slicers.

You will continue the development of the report that you commenced designing in Lab 08A.

1. In Power BI Desktop, in the Sales Report file, on the Overview page, set the Year slicer to
FY2018.

2. Go to the My Performance page, and then notice that the Year slicer is a different value.

When slicers aren’t synced, it can contribute to misrepresentation of data and frustration for report
users. You’ll now sync the report slicers.

3. Go to the Overview page, and then select the Year slicer.

4. On the View ribbon tab, from inside the Show Panes group, click Sync Slicers.

5. In the Sync Slicers pane (at the left of the Visualizations pane), in the second column (which
represents syncing), check the checkboxes for the Overview and My Performance pages.

6. On the Overview page, select the Region slicer.

7. Sync the slicer with the Overview and Profit pages.

8. Test the sync slicers by selecting different filter options, and then verifying that the synced slicers
filter by the same options.

9. To close the Sync Slicer page, click the X located at the top-right of the pane.

Configure Drill Through


In this exercise, you will create a new page and configure it as a drill through page. When you’ve completed
the design, the page will look like the following:

Task 1: Create a drill through page

In this task, you will create a new page and configure it as a drill through page.

1. Add a new report page named Product Details.

2. Right-click the Product Details page tab, and then select Hide Page.

Report users won’t be able to go to the drill through page directly. They’ll need to access it from
visuals on other pages. You’ll learn how to drill through to the page in the final exercise of this
lab.

3. Beneath the Visualizations pane, in the Drill Through section, add the Product | Category field
to the Add Drill-Through Fields Here box.

4. To test the drill through page, in the drill through filter card, select Bikes.

5. At the top-left of the report page, notice the arrow button.

The button was added automatically. It allows report users to navigate back to the page from which
they drilled through.

6. Add a Card visual to the page, and then resize and reposition it so it sits to the right of the button
and fills the remaining width of the page.

7. Drag the Product | Category field into the card visual.

8. Configure the format options for the visual, and then turn the Category Label property to Off.

9. Set the Background Color property to a light shade of gray.

10. Add a Table visual to the page, and then resize and reposition it so it sits beneath the card visual
and fills the remaining space on the page.

11. Add the following fields to the visual:

o Product | Subcategory
o Product | Color
o Sales | Quantity
o Sales | Sales
o Sales | Profit Margin

12. Configure the format options for the visual, and in the Grid section, set the Text Size property to
20pt.

The design of the drill through page is almost complete. In the next exercise, you’ll define
conditional formatting.

Activity 4:
Add Conditional Formatting

Solution:
In this exercise, you will enhance the drill through page with conditional formatting. When you’ve
completed the design, the page will look like the following:

Task 1: Add conditional formatting

In this task, you will enhance the drill through page with conditional formatting.

1. Select the table visual.

2. In the visual fields pane, for the Profit Margin field, click the down-arrow, and then select
Conditional Formatting | Icons.

3. In the Icons – Profit Margin window, in the Icon Layout dropdown list, select Right of Data.

4. To delete the middle rule, at the left of the yellow triangle, click X.

5. Configure the first rule (red diamond) as follows:

o In the second control, remove the value
o In the third control, select Number
o In the fifth control, enter 0
o In the sixth control, select Number

6. Configure the second rule (green circle) as follows:

o In the second control, enter 0
o In the third control, select Number
o In the fifth control, remove the value
o In the sixth control, select Number

The rules are as follows: display a red diamond if the profit margin value is less than 0; otherwise,
if the value is greater than or equal to zero, display the green circle.
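As an aside, the same threshold logic could instead be expressed as a measure that returns a color name, which could then drive Format By: Field Value conditional formatting. This is a sketch only; this lab uses the rules dialog instead:

```dax
-- Red for a negative profit margin, green otherwise (mirrors the two icon rules)
Profit Margin Color =
IF ( [Profit Margin] < 0, "Red", "Green" )
```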

7. Click OK.

8. In the table visual, verify that the correct icons are displayed.

9. Configure background color conditional formatting for the Color field.

10. In the Background Color – Color window, in the Format By dropdown list, select Field Value.

11. In the Based on Field dropdown list, select Product | Formatting | Background Color Format.

12. Click OK.

13. Repeat the previous steps to configure font color conditional formatting for the Color field, using
the Product | Formatting | Font Color Format field.

Add Bookmarks and Buttons


In this exercise, you will enhance the My Performance page with buttons, allowing the report user to select
the visual type to display. When you’ve completed the design, the page will look like the following:

Task 1: Add bookmarks

In this task, you will add two bookmarks, one to display each of the monthly sales/targets visuals.

1. Go to the My Performance page.

2. On the View ribbon tab, from inside the Show Panes group, click Bookmarks.

3. On the View ribbon tab, from inside the Show Panes group, click Selection.

4. In the Selection pane, beside one of the Sales and Target by Month items, to hide the visual, click
the eye icon.

5. In the Bookmarks pane, click Add.

6. To rename the bookmark, double-click the bookmark.

7. If the visible chart is the bar chart, rename the bookmark as Bar Chart ON, otherwise rename the
bookmark as Column Chart ON.

8. In the Selection pane, toggle the visibility of the two Sales and Target by Month items.

In other words, make the visible visual hidden, and make the hidden visual visible.

9. Create a second bookmark, and name it appropriately (either Column Chart ON or Bar Chart
ON).

10. In the Selection pane, to make both visuals visible, simply show the hidden visual.

11. Resize and reposition both visuals so they fill the page beneath the multi-row card visual, and
completely overlap one another.

Tip: To select the visual that is covered up, select it in the Selection pane.

12. In the Bookmarks pane, select each of the bookmarks, and notice that only one of the visuals is
visible.

The next stage of design is to add two buttons to the page, which will allow the report user to select
the bookmarks.

Task 2: Add buttons

In this task, you will add two buttons, and assign bookmark actions to each.

1. On the Insert ribbon, from inside the Elements group, click Button, and then select Blank.

2. Reposition the button directly beneath the Year slicer.

3. Select the button, and then in the Visualizations pane, turn the Button Text property to On.

4. Expand the Button Text section, and then in the Button Text box, enter Bar Chart.

5. Format the background color, using a suitable color.

6. Turn the Action property to On (located near the bottom of the list).

7. Expand the Action section, and then set the Type dropdown list to Bookmark.
8. In the Bookmark dropdown list, select Bar Chart ON.

9. Create a copy of the button by using copy and paste, and then configure the new button as follows:

o Set the Button Text property to Column Chart
o In the Action section, set the Bookmark dropdown list to Column Chart ON

Task 3: Publish the report

In this task, you will publish the report.

1. Select the Overview page.

2. In the Year slicer, select FY2020.

3. In the Region slicer, select Select All.

4. Save the Power BI Desktop file.

5. Publish the report to your Sales Analysis workspace.

6. When prompted to replace the report, click Replace.

7. Leave Power BI Desktop open.

In the next exercise, you will explore the report in the Power BI service.

Explore the Report
In this exercise, you will explore the Sales Report in the Power BI service.

Task 1: Explore the report

In this task, you will explore the Sales Report in the Power BI service.

1. In Edge, in the Power BI service, open the Sales Report report.
2. To test the drill through report, in the Quantity by Category visual, right-click the Clothing bar,
and then select Drill Through | Product Details.

3. Notice that the Product Details page is for Clothing.

4. To return to the source page, at the top-left corner, click the arrow button.

5. Select the My Performance page.

6. Click each of the buttons, and then notice that a different visual is displayed.

Finish up

In this task, you will complete the lab.

1. To return to the workspace, in the breadcrumb trail, click your workspace name.

2. Leave the Edge browser window open.

3. In Power BI Desktop, go to the My Performance page, and in the Fields pane, remove the
Salesperson filter card.

4. Select the Overview page.

5. Save the Power BI Desktop file, and then republish to the Sales Analysis workspace.

6. Close Power BI Desktop.

3) Graded Lab Tasks


Note: The instructor can design graded lab activities according to the level of difficulty and
complexity of the solved lab activities. The lab tasks assigned by the instructor should be
evaluated in the same lab.
Lab Task 1
Create calculated measures such as Total Sales, Average Revenue, etc. Design a report layout by adding
visuals like bar charts, line charts, and tables. Format visuals to improve readability and aesthetics. Add
filters and slicers to interactively analyze data.

Lab 15
Creating a Power BI Dashboard and Data Analysis

Objective:
The objective of this lab is to create dashboards in Power BI.
Activity Outcomes:
The activities provide hands-on practice with the following topics:

• Pin visuals to a dashboard


• Use Q&A to create dashboard tiles
• Configure a dashboard tile alert
• Create animated scatter charts
• Use a visual to forecast values
• Work with the decomposition tree visual
• Work with the key influences visual

Instructor Note:
As pre-lab activity, read Chapter xx from the textbook “Business Intelligence Guidebook: From Data
Integration to Analytics, Rick Sherman, Morgan Kaufmann Press, 2014”.

1) Useful Concepts

What are Power BI Dashboards?

Dashboards, in the context of Power BI, are visual displays that provide a consolidated view of data. They
allow users to monitor key metrics, track performance, and gain high-level insights at a glance.

In general, dashboards are designed to display data in real-time or near-real-time. They can connect to
various data sources, including databases, cloud services, and streaming data, providing up-to-date
information.

While different visualizations can be used in dashboards, they focus mainly on charts, graphs, gauges, and
cards. These visual elements represent key performance indicators (KPIs) and provide a quick overview
of business metrics. Therefore, dashboards are typically limited to a single page, allowing users to see
multiple visualizations at once. This simplicity helps users quickly grasp the overall performance of their
business.

In addition, dashboard interactivity is limited to keep things simple.

Publishing a dashboard to Power BI Service even allows developers to fix certain filters or visual elements
for a set of end users.

2) Solved Lab Activities


Sr.No Allocated Time Level of Complexity CLO Mapping
1 10 Low CLO-6
2 10 Low CLO-6
3 20 Medium CLO-6
4 10 Low CLO-6

Activity 1:
Create a Dashboard

Solution:
In this exercise, you will create the Sales Monitoring dashboard. The completed dashboard will look like
the following:

Task 1: Create a dashboard

In this task, you will create the Sales Monitoring dashboard.

1. In Edge, in the Power BI service, open the Sales Report report.

2. To create a dashboard and pin the logo image, hover the cursor over the Adventure Works logo.

3. At the top-right corner, click the pushpin.

4. In the Pin to Dashboard window, in the Dashboard Name box, enter Sales Monitoring.

5. Click Pin.

6. Set the Year slicer to FY2020.

7. Set the Region slicer to Select All.

When you pin a visual to a dashboard, it uses the current filter context. Once pinned, the filter
context cannot be changed. For time-based filters, it’s a better idea to use a relative date slicer (or
Q&A with a relative time-based question).

8. Pin the Sales and Profit Margin by Month (column/line) visual to the Sales Monitoring
dashboard.

9. Open the Navigation pane, and then open the Sales Monitoring dashboard.

10. Notice that the dashboard has two tiles.

11. To resize the logo tile, drag the bottom-right corner, and resize the tile to become one unit wide,
and two units high.

Tile sizes are constrained to a rectangular grid. It’s only possible to resize a tile in multiples of
the grid unit.

12. To add a tile based on a question, at the top-left of the dashboard, click Ask a Question About
Your Data.

13. You can use the Q&A feature to ask a question, and Power BI will respond with a visual.

14. Click any one of the suggested questions beneath the Q&A box, in gray boxes.

15. Review the response.


16. Remove all text from the Q&A box.

17. In the Q&A box, enter the following: Sales YTD

18. Notice the response of (Blank).

Recall that you added the Sales YTD measure in Lab 06B. This measure is a Time Intelligence
expression, and it requires a filter on the Date table to produce a result.
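For instance, the blank result would be resolved by any filter that constrains the Date table, such as a year filter applied in DAX (a sketch; the Year column values, like "FY2020", follow the slicer values used in these labs):

```dax
-- Adding a Date-table filter gives the YTD calculation a period to accumulate over
Sales YTD FY2020 =
CALCULATE ( [Sales YTD], 'Date'[Year] = "FY2020" )
```

Appending “in year FY2020” to the Q&A question has the same effect.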

19. Extend the question with: in year FY2020.

20. Notice the response is now $33M.

21. To pin the response to the dashboard, at the top-right corner, click Pin Visual.

22. When prompted to pin the tile to the dashboard, click Pin.

There’s a possible bug that will only allow you to pin to a new dashboard. It’s because your Power
BI session has reverted to your “My Workspace”. If this happens, do not pin to a new dashboard.
Return to your Sales Analysis workspace, open the dashboard again, and recreate the Q&A
question.

23. To return to the dashboard, at the top-left corner, click Exit Q&A.

Task 2: Edit tile details

In this task, you will edit the details of two tiles.

1. Hover the cursor over the Sales YTD tile, and then at the top-right of the tile, click the ellipsis, and
then select Edit Details.

2. In the Tile Details pane (located at the right), in the Subtitle box, enter FY2020.

3. At the bottom of the pane, click Apply.

4. Notice that the Sales YTD tile displays a subtitle.

5. Edit the tile details for the Sales, Profit Margin tile.

6. In the Tile Details pane, in the Functionality section, check Display Last Refresh Time.

7. Click Apply.

8. Notice that the tile describes the last refresh time (which you did when refreshing the data model
in Power BI Desktop).

Later in this lab, you’ll simulate a data refresh, and notice that the refresh time updates.

Task 3: Configure an alert

In this task, you will configure a data alert.

1. Hover the cursor over the Sales YTD tile, click the ellipsis, and then select Manage Alerts.

2. In the Manage Alerts pane (located at the right), click Add Alert Rule.

3. In the Threshold box, replace the value with 35000000 (35 million).

This configuration will ensure you’re notified whenever the tile updates to a value above 35 million.

4. At the bottom of the pane, click Save and Close.

In the next exercise, you’ll refresh the dataset. Typically, this would be done by using scheduled
refresh, and Power BI could use a gateway to connect to the SQL Server database. However, due
to constraints in the classroom setup, there is no gateway. So, you’ll open Power BI Desktop,
perform a manual data refresh, and then upload the file.

Refresh the Dataset


In this exercise, you will first load sales order data for June 2020 into the AdventureWorksDW2020
database, and then add your classroom partner’s account to the database. You will then open your Power
BI Desktop file, perform a data refresh, and then upload the file to your Sales Analysis workspace.

Task 1: Update the lab database

In this task, you will run a PowerShell script to update data in the
AdventureWorksDW2020 database.

1. In File Explorer, inside the D:\DA100\Setup folder, right-click the UpdateDatabase2-
AddSales.ps1 file, and then select Run with PowerShell.

When prompted to press any key to continue, press Enter.

The AdventureWorksDW2020 database now includes sales orders for June 2020.

2. Inside the D:\DA100\Setup folder, right-click the UpdateDatabase-3AddPartnerAccount.ps1
file, and then select Run with PowerShell.

3. When prompted, enter the account name of your classroom partner, and then press Enter.

You only need to enter their account name (all characters before the @ symbol). Choose somebody
sitting near you—you will work together in pairs to complete Lab 12A, which covers sharing Power
BI content.

Their account name is added so you can test the row-level security. Your partner is now Pamela
Ansman-Wolfe, whose sales performance is measured by the sales of two sales territory regions: US
Northwest and US Southwest.

Task 2: Refresh the Power BI Desktop file

In this task, you will open the Sales Analysis Power BI Desktop file, perform a data refresh, and then upload
the file to your Sales Analysis workspace.

1. Open your Sales Analysis Power BI Desktop file, stored in the D:\DA100\MySolution folder.

When the file was published in Lab 06B, if you weren’t confident you had completed the lab successfully,
you were advised to upload the solution file instead. If you uploaded the solution file, be sure
to open the solution file again now. It’s located in the D:\DA100\Lab06B\Solution folder.

2. On the Home ribbon, from inside the Queries group, click Refresh.

3. When the refresh completes, save the Power BI Desktop file.

4. Publish the file to your Sales Analysis workspace.

5. When prompted to replace the dataset, click Replace.

The dataset in the Power BI service now has June 2020 sales data.

6. Close Power BI Desktop.

7. In Edge, in the Power BI service, in your Sales Analysis workspace, notice that the Sales Analysis
report was also published.

This report was used to test the model as you developed it in Lab 05A and Lab 06A.

8. Remove the Sales Analysis report (not dataset).

Review the Dashboard


In this exercise, you will review the dashboard to notice updated sales, and that the alert was triggered.

Task 1: Review the dashboard

In this task, you will review the dashboard to notice updated sales, and that the alert was triggered.

1. In Edge, in the Power BI service, open the Sales Monitoring dashboard.

2. In the Sales, Profit Margin tile, in the subtitle, notice that the data was refreshed just now.

3. Notice also that there is now a column for 2020 Jun.

The alert on the Sales YTD tile should have triggered also. After a short while, the alert should
notify you that sales now exceed the configured threshold value.
4. Notice that the Sales YTD tile has updated to $37M.

5. Verify that the Sales YTD tile displays an alert notification icon.

If you don’t see the notification, you might need to press F5 to reload the browser. If you still don’t
see the notification, wait some minutes longer.

Alert notifications appear on the dashboard tile, and can be delivered by email and by push
notification to mobile apps, including the Apple Watch.

Activity 2:
Data Analysis in Power BI Desktop

Solution:

Task 1: Create the report

In this task, you will create the Sales Exploration report.

1. Open Power BI Desktop, and dismiss the welcome screen.

2. Save the file to the D:\DA100\MySolution folder, as Sales Exploration.

3. Create a live connection to your Sales Analysis dataset.

Tip: Use the Get Data command on the Home ribbon tab, and then select Power BI Datasets.

You will now create four report pages, and on each page you’ll work with a different visual to analyze
and explore data.

Create a Scatter Chart


In this exercise, you will create a scatter chart that can be animated.

Task 1: Create an animated scatter chart

In this task, you will create a scatter chart that can be animated.

1. Rename Page 1 as Scatter Chart.

2. Add a Scatter Chart visual to the report page, and then reposition and resize it so it fills the entire
page.

3. Add the following fields to the visual wells:

o Legend: Reseller | Business Type
o X Axis: Sales | Sales
o Y Axis: Sales | Profit Margin
o Size: Sales | Quantity
o Play Axis: Date | Quarter

The chart can be animated when a field is added to the Play Axis well.

4. In the Filters pane, add the Product | Category field to the Filters On This Page well.

5. In the filter card, filter by Bikes.

6. To animate the chart, at the bottom left corner, click Play.

7. Watch the entire animation cycle from FY2018 Q1 to FY2020 Q4.

The scatter chart lets you understand three measure values simultaneously: in this case, order
quantity, sales revenue, and profit margin.

Each bubble represents a reseller business type. Changes in bubble size reflect increased or
decreased order quantities, horizontal movements represent changes in sales revenue, and vertical
movements represent changes in profitability.

8. When the animation stops, click one of the bubbles to reveal its tracking over time.

9. Hover the cursor over any bubble to reveal a tooltip describing the measure values for the reseller
type at that point in time.

10. In the Filters pane, filter by Clothing only, and notice that it produces a very different result.

11. Save the Power BI Desktop file.

Activity 3:
Create a Forecast

Solution:
In this exercise, you will create a forecast to determine possible future sales revenue.

Task 1: Create a forecast

In this task, you will create a forecast to determine possible future sales revenue.

1. Add a new page, and then rename the page to Forecast.

2. Add a Line Chart visual to the report page, and then reposition and resize it so it fills the entire
page.

3. Add the following fields to the visual wells:

o Axis: Date | Date
o Values: Sales | Sales

4. In the Filters pane, add the Date | Year field to the Filters On This Page well.

5. In the filter card, filter by two years: FY2019 and FY2020.

When forecasting over a time line, you will need at least two cycles (years) of data to produce an
accurate and stable forecast.

6. Add also the Product | Category field to the Filters On This Page well, and filter by Bikes.

7. To add a forecast, beneath the Visualizations pane, select the Analytics pane.

8. Expand the Forecast section.

If the Forecast section is not available, it’s probably because the visual hasn’t been correctly
configured. Forecasting is only available when two conditions are met: the axis has a single field
of type date, and there’s only one value field.

9. Click Add.

10. Configure the following forecast properties:

o Forecast length: 1 month
o Confidence interval: 80%
o Seasonality: 365

11. Click Apply.

12. In the line visual, notice that the forecast has extended one month beyond the historical data.

The gray area represents the confidence interval. The wider the interval, the less stable, and
therefore the less accurate, the forecast is likely to be.

When you know the length of the seasonal cycle (annual, in this case), you should enter the number
of seasonality points: 365 for daily data with a yearly cycle. A weekly cycle would be 7 points,
and a monthly cycle about 30.
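To see why at least two full cycles of history matter, here is a minimal seasonal-naive forecast in Python. This is an illustration only: Power BI's forecasting uses a more sophisticated exponential-smoothing model, and the figures below are invented.

```python
def seasonal_naive_forecast(history, season_length, horizon):
    """Forecast future points by repeating the most recent full cycle."""
    last_cycle = history[-season_length:]
    return [last_cycle[i % season_length] for i in range(horizon)]

# Two full weekly cycles (season length 7); with fewer than two cycles
# there is no way to tell trend apart from seasonality.
history = [5, 6, 7, 9, 8, 4, 3,
           5, 7, 7, 9, 8, 4, 3]
print(seasonal_naive_forecast(history, 7, 3))  # [5, 7, 7]
```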

13. In the Filters pane, filter by Clothing only, and notice that it produces a different result.

14. Save the Power BI Desktop file.

Work with a Decomposition Tree
In this exercise, you will create a decomposition tree to explore the relationships between reseller geography
and profit margin.

Task 1: Work with a decomposition tree

In this task, you will create a decomposition tree to explore the relationships between reseller geography
and profit margin.

1. Add a new page, and then rename the page to Decomposition Tree.

2. On the Insert ribbon, from inside the AI Visuals group, click Decomposition Tree.
Tip: The AI visuals are also available in the Visualizations pane.

3. Reposition and resize the visual so it fills the entire page.

4. Add the following fields to the visual wells:

o Analyze: Sales | Profit Margin
o Explain By: Reseller | Geography (the entire hierarchy)

5. In the Filters pane, add the Date | Year field to the Filters On This Page well, and set the filter to
FY2020.

6. In the decomposition tree visual, notice the root of the tree: Profit Margin at -0.94%

7. Click the plus icon, and in the context menu, select High Value.

8. Notice that the decomposition tree presents resellers, ordered from highest to lowest profit margin.

9. To remove the level, at the top of the visual, beside the Reseller label, click X.

10. Click the plus icon again, and then expand to the Country-Region level.

11. Expand from the United States to the State-Province level.

12. Use the down-arrow located at the bottom of the visual for State-Province, and then scroll to the
lower profitable states.
13. Notice that New York state has negative profitability.

14. Expand from New York to the Reseller level.

15. Notice that it is easy to isolate the root cause.

United States is not producing profit in FY2020. New York is one state not achieving positive
profit, and it’s due to four resellers paying less than standard costs for their goods.
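The drill-down the decomposition tree performs is, at heart, a grouped aggregation repeated at each level of the hierarchy. A rough Python sketch with invented reseller rows (not the lab's actual figures):

```python
def profit_margin_by(rows, key):
    """One expand step of the tree: profit margin (profit / sales) per group."""
    totals = {}
    for row in rows:
        sales, profit = totals.get(row[key], (0, 0))
        totals[row[key]] = (sales + row["sales"], profit + row["profit"])
    return {group: round(profit / sales, 2)
            for group, (sales, profit) in totals.items()}

# Hypothetical reseller rows for illustration.
rows = [
    {"state": "New York", "reseller": "A", "sales": 100, "profit": -20},
    {"state": "New York", "reseller": "B", "sales": 50,  "profit": 5},
    {"state": "Ohio",     "reseller": "C", "sales": 80,  "profit": 8},
]
print(profit_margin_by(rows, "state"))     # {'New York': -0.1, 'Ohio': 0.1}
print(profit_margin_by(rows, "reseller"))  # expand one level further
```

Grouping by state shows where the negative margin sits; regrouping the same rows by reseller isolates which resellers cause it, which is exactly the expand gesture in the visual.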

16. Save the Power BI Desktop file.

Exercise 5: Work with Key Influencers


In this exercise, you will use the Key Influencers AI visual to determine what influences profitability within
reseller business types and geography.

Task 1: Work with key influencers


In this task, you will use the Key Influencers AI visual to determine what influences profitability within
reseller business types and geography.

1. Add a new page, and then rename the page to Key Influencers.

2. On the Insert ribbon, from inside the AI Visuals group, click Key Influencers.

Tip: The AI visuals are also available in the Visualizations pane.

3. Reposition and resize the visual so it fills the entire page.

4. Add the following fields to the visual wells:

o Analyze: Sales | Profit Margin
o Explain By: Reseller | Business Type and Reseller | Geography (the entire hierarchy)
o Expand By: Sales | Quantity

5. At the top-left of the visual, notice that Key Influencers is in focus, and the specific influence is
set to understand what influences profit margin to increase.

6. Review the result, which is that profit margin is more likely to increase when the city is Bothell.

7. Modify the target to determine what influences profit margin to decrease.

8. Review the result.

9. To detect segments, at the top-left, select Top Segments.

10. Notice that the target is now to determine segments when profit margin is likely to be high.
11. When the visual displays the segments (as circles), click one of them to reveal information about
it.

12. Review the segment results.

3) Graded Lab Tasks


Note: The instructor can design graded lab activities according to the level of difficulty and
complexity of the solved lab activities. The lab tasks assigned by the instructor should be
evaluated in the same lab.
Lab Task 1
Take any business-related dataset, prepare a suitable dashboard to display maximum insights and perform
analysis on that dataset.

Lab 16
Get Started with Tableau Desktop -part1
Objective:
The objective of this lab is to get an introduction to the Tableau environment and to perform a few tasks to get
insights from a dataset.
Activity Outcomes:
The activities provide hands-on practice with the following topics:
• Connecting to data
• Generating basic charts
• Adding filters to the view
• Adding colors to the view
Instructor Note:
As a pre-lab activity, read Chapter xx from the textbook “Business Intelligence Guidebook: From Data
Integration to Analytics, Rick Sherman, Morgan Kaufmann Press, 2014”.
1) Useful Concepts

Start Page
The start page in Tableau Desktop is a central location from which you can do the following:
• Connect to your data
• Open your most recently used workbooks, and
• Discover and explore content produced by the Tableau community.

The start page consists of three panes: Connect, Open, and Discover.

Connect

Connect to data and open saved data sources.

On the Connect pane, you can do the following:

• Connect to data: Under To a File, connect to data stored in Microsoft Excel files, text files,
Access files, Tableau extract files, and statistical files, such as SAS, SPSS, and R. Under To a
Server, connect to data stored in databases like Microsoft SQL Server or Oracle. The server
names listed in this section change based on which servers you connect to and how often.

• Open saved data sources: Quickly open data sources that you have previously saved to your
My Tableau Repository directory. Also, Tableau provides sample saved data sources that you
can use to explore Tableau Desktop functionality. To follow along with examples in the
Tableau Desktop documentation, you'll usually use the Sample – Superstore data source.

Open

Open recent workbooks, pin workbooks to the start page, and explore accelerator workbooks.

On the Open pane, you can do the following:

• Open recently opened workbooks: When you open Tableau Desktop for the first time, this pane is
empty. As you create and save new workbooks, the most recently opened workbooks appear here. Click
the workbook thumbnail to open a workbook, or if you don't see a workbook thumbnail, click the Open
a Workbook link to find other workbooks that are saved to your computer.

• Pin workbooks: You can pin workbooks to the start page by clicking the pin icon that appears in the
top-left corner of the workbook thumbnail. Pinned workbooks always appear on the start page, even if
they weren't opened recently. To remove a recently opened or pinned workbook, hover over the
workbook thumbnail, and then click the "x" that appears. The workbook thumbnail is removed
immediately but will show again with your most recently used workbooks the next time you open
Tableau Desktop.

• Explore accelerators: Open and explore accelerator workbooks to see what you can do with
Tableau. Prior to 2022.2, these were called sample workbooks.

Discover

See popular views in Tableau Public, read blog posts and news about Tableau, and find training videos and
tutorials to help you get started.

Case-study for this Practice:


Suppose you are an employee for a large retail chain. Your manager just got the quarterly sales report and noticed
that sales seem better for some products than for others and profit in some areas is not doing as well as she had
expected. Your boss is interested in the bottom line: It's your job to look at overall sales and profitability to see
if you can find out what's driving these numbers.
She has also asked you to identify areas for improvement and present your findings to the team. The team can
explore your results and take action to improve sales and profitability for the company's product lines.
You'll use Tableau Desktop to build a simple view of your product data, map product sales and profitability by
region, build a dashboard of your findings, and then create a story to present. Then, you will share your findings
on the web so that remote team members can take a look.

2) Solved Lab Activities
Sr.No Allocated Time Level of Complexity CLO Mapping
1 10 Low CLO-6
2 10 Low CLO-6
3 20 Medium CLO-6
4 10 Low CLO-6

Activity 1:
Data connection and generating basic graphs.

Solution:

Step 1: Connect to your data:


Your manager has asked you to look into the overall sales and profitability for the company and to identify key
areas for improvement. You have a bunch of data, but you aren’t sure where to start.

Open Tableau Desktop and begin:

The first thing you see after you open Tableau Desktop is the Start page. Here, you select the connector (how
you will connect to your data) that you want to use.

The Tableau start page

The start page gives you several options to choose from:

1. Tableau icon. Click in the upper left corner of any page to toggle between the start page and the
authoring workspace.

2. Connect pane. Under Connect, you can:

• Connect to data that is stored in a file, such as Microsoft Excel, PDF, Spatial files, and more.
• Connect to data that is stored on Tableau Server, Microsoft SQL Server, Google Analytics, or another
server.
• Connect to a data source that you’ve connected to before.

Tableau supports the ability to connect to a wide variety of data stored in a wide variety of places. The Connect
pane lists the most common places that you might want to connect to, or click the More links to see more options.

3. Under Accelerators, view accelerator workbooks that come with Tableau Desktop. Prior to 2022.2, these
were called sample workbooks.

4. Under Open, you can open workbooks that you've already created.

5. Under Discover, find additional resources like video tutorials, forums, or the “Viz of the week” to get ideas
about what you can build.

In the Connect pane, under Saved Data Sources, click Sample - Superstore to connect to the sample data set.

After you select Sample - Superstore, your screen will look something like this:

The Sample - Superstore data set comes with Tableau. It contains information about products, sales, profits, and
so on that you can use to identify key areas for improvement within this fictitious company.
Visualization in Tableau is possible through dragging and dropping Measures and Dimensions onto these
different Shelves.

Rows and Columns: Represent the x and y axes of your graphs/charts.

Filter: Filters let you view a subset of your data. For example, instead of seeing the combined Sales of
all the Categories, you can look at a specific one, such as just Furniture.

Pages: Pages work on the same principle as Filters, with the difference that you can actually watch the
changes as you shift between the paged values. Remember the Rosling chart? You can easily make one of
your own using Pages.

Marks: The Marks property is used to control the mark types of your data. You may choose to represent
your data using different shapes, sizes, or text.

And finally, there is Show Me, the brain of Tableau!

When you drag and drop fields onto the visualization area, Tableau makes default graphs for you, as we shall see soon, but you can
change these by referring to the Show Me option.

Note: Not every graph can be made with any combination of Dimensions or Measures. Each graph has its own conditions for the
number and types of fields that can be used, which we shall discuss next.

Step 2: Drag and drop to take a first look:

Create a view
You set out to identify key areas for improvement, but where to start? With four years' worth of data, you decide
to drill into the overall sales data to see what you find. Start by creating a simple chart.

1. From the Data pane, drag Order Date to the Columns shelf.

Note: When you drag Order Date to the Columns shelf, Tableau creates a column for each year in
your data set. Under each column is an Abc indicator. This indicates that you can drag text or numerical
data here, like what you might see in an Excel spreadsheet. If you were to drag Sales to this area, Tableau
creates a crosstab (like a spreadsheet) and displays the sales totals for each year.

2. From the Data pane, drag Sales to the Rows shelf.

Tableau generates the following chart with sales rolled up as a sum (aggregated). You can see total
aggregated sales for each year by order date.

When you first create a view that includes time (in this case Order Date), Tableau automatically
generates a line chart.

This line chart shows that sales look pretty good and seem to be increasing over time. This is good information,
but it doesn't really tell you much about which products have the strongest sales and if there are some products
that might be performing better than others. Since you just got started, you decide to explore further and see
what else you can find out.
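The "rolled up as a sum" behavior above can be sketched in plain Python: dropping a measure onto a shelf groups the rows by the dimensions in the view and sums the measure within each group. The rows below are toy data, not the Superstore data set.

```python
from collections import defaultdict

def rollup(orders, dim):
    """Roll the sales measure up as a SUM along one dimension."""
    totals = defaultdict(float)
    for order in orders:
        totals[order[dim]] += order["sales"]
    return dict(totals)

orders = [  # invented order rows for illustration
    {"year": 2020, "category": "Furniture",  "sales": 120.0},
    {"year": 2020, "category": "Technology", "sales": 200.0},
    {"year": 2021, "category": "Furniture",  "sales": 150.0},
]
print(rollup(orders, "year"))      # {2020: 320.0, 2021: 150.0}
print(rollup(orders, "category"))  # adding Category splits the totals
```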

Check your work! Watch "Create a view" in action


Refine your view

To gain more insight into which products drive overall sales, try adding more data. Start by adding the
product categories to look at sales totals in a different way.

1. From the Data pane, drag Category to the Columns shelf and place it to the right of
YEAR(Order Date).

Your view updates to a bar chart. By adding a second discrete dimension to the view you can
categorize your data into discrete chunks instead of looking at your data continuously over time.
This creates a bar chart and shows you overall sales for each product category by year.

Your view is doing a great job showing sales by category—furniture, office supplies, and
technology. An interesting insight is revealed!

From this view, you can see that sales for furniture is growing faster than sales for office
supplies, even though Office Supplies had a really good year in 2021. Perhaps you can
recommend that your company focus sales efforts on furniture instead of office supplies? Your
company sells a lot of different products in those categories, so you'll need more information
before you can make a recommendation.

To help answer that question, you decide to look at products by sub-category to see which items
are the big sellers. For example, for the Furniture category, you want to see details about
bookcases, chairs, furnishings, and tables. Looking at this data might help you gain insights
into sales and later on, overall profitability, so add sub-categories to your bar chart.

2. Double-click or drag Sub-Category to the Columns shelf.

Note: You can drag and drop or double-click a field to add it to your view, but be careful.
Tableau makes assumptions about where to add that data, and it might not be placed where you
expect. You can always click Undo to remove the field, or drag it off the area where Tableau
placed it to start over.

Sub-Category is another discrete field. It creates another header at the bottom of the view, and
shows a bar for each sub-category (68 marks) broken down by category and year.

Now you are getting somewhere, but this is a lot of data to visually sort through. In the next section,
you will learn how you can add color, filters, and more to focus on specific results.

Check your work! Watch "Refine your view" in action


Step summary
This step was all about getting to know your data and starting to ask questions about your data to gain
insights. You learned how to:

• Create a chart in a view that works for you.

• Add fields to get the right level of detail in your view.

Now you're ready to begin focusing on your results to identify more specific areas of concern. In the
next section, you will learn how to use filters and colors to help you explore your data visually.

Step 3: Focus your results:

You've created a view of product sales broken down by category and sub-category. You are starting to
get somewhere, but that is a lot of data to sort through. You need to easily find the interesting data
points and focus on specific results. Well, Tableau has some great options for that!

Filters and colors are ways you can add more focus to the details that interest you. After you add focus
to your data, you can begin to use other Tableau Desktop features to interact with that data.

Activity 2:
Add filters to your view

Solution:

You can use filters to include or exclude values in your view. In this example, you decide to add two
simple filters to your worksheet to make it easier to look at product sales by sub-category for a specific
year.
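An include-filter of this kind is effectively a membership test applied row by row before any aggregation happens. A rough Python illustration (the field names and rows are invented):

```python
def apply_filter(rows, field, allowed):
    """Keep only the rows whose `field` value is in the filter's selection."""
    return [row for row in rows if row[field] in allowed]

# Toy rows standing in for the Superstore data.
rows = [
    {"year": 2020, "sub_category": "Chairs"},
    {"year": 2021, "sub_category": "Chairs"},
    {"year": 2021, "sub_category": "Tables"},
]

# Two filters applied in sequence, like the two filter cards in the view.
filtered = apply_filter(rows, "year", {2021})
filtered = apply_filter(filtered, "sub_category", {"Chairs"})
print(len(filtered))  # 1
```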

1. In the Data pane, right-click Order Date and select Show Filter.

2. Repeat the step above for the Sub-Category field.

The filters are added to the right side of your view in the order that you selected them. Filters
are card types and can be moved around on the canvas by clicking on the filter and dragging it
to another location in the view. As you drag the filter, a line appears that shows you where you
can drop the filter to move it.

Note: The Get Started tutorial uses the default position of the filter cards.

More on Filtering in the Learning Library.

Check your work! Watch "Apply filters to your view" in action


Activity 3:
Add color to your view

Solution:

Adding filters helps you to sort through all of this data—but wow, that’s a lot of blue! It's time to do something
about that.

Currently, you are looking at sales totals for your various products. You can see that some products have
consistently low sales, and some products might be good candidates for reducing sales efforts for those product
lines. But what does overall profitability look like for your different products? Drag Profit to color to see what
happens.

From the Data pane, drag Profit to Color on the Marks card.

By dragging profit to color, you now see that you have negative profit in Tables, Bookcases, and even Machines.
Another insight is revealed!

Note: Tableau automatically added a color legend and assigned a diverging color palette because your data
includes both negative and positive values.

Step summary
In this step you used filter and color to make working with your data a bit easier. You also learned about a few
fun features that Tableau offers to help you answer key questions about your data. You learned how to:

• Apply filters and color to make it easier to focus on the areas of your data that interest you the most.

• Interact with your chart using the tools that Tableau provides.

• Duplicate worksheets and save your changes to continue exploring your data in different ways without
losing your work.

3) Graded Lab Tasks


Note: The instructor can design graded lab activities according to the level of difficulty and complexity
of the solved lab activities. The lab tasks assigned by the instructor should be evaluated in the same
lab.
Lab Task 1

Download any dataset from Kaggle and generate basic graphs, add filters and colors to your graphs.

Lab 17

Working with Tableau -Part2

Objective:
The objective of this lab is to get an introduction to the Tableau environment and to perform a few tasks to get
insights from a dataset.
Activity Outcomes:
The activities provide hands-on practice with the following topics:
• Exploring your data geographically
• Creating a Top N filter
• Building a dashboard
Instructor Note:
1. As a pre-lab activity, read Chapter xx from the textbook “Business Intelligence Guidebook:
From Data Integration to Analytics, Rick Sherman, Morgan Kaufmann Press, 2014”.

1) Useful Concepts

2) Solved Lab Activities


Sr.No Allocated Time Level of Complexity CLO Mapping
1 20 Low CLO-6
2 20 Medium CLO-6
3 20 Medium CLO-6

Activity 1:
Explore your data geographically

Solution:
You've built a great view that allows you to review sales and profits by product over several years. And after
looking at product sales and profitability in the South, you decide to look for trends or patterns in that region.

Because you're looking at geographic data (the Region field), you have the option to build a map view. Map
views are great for displaying and analyzing this kind of information. Plus, they're just cool!

For this example, Tableau has already assigned the proper geographic roles to the Country, State, City, and
Postal Code fields. That's because it recognized that each of those fields contained geographic data. You can get
to work creating your map view right away.

Build a map view


Start fresh with a new worksheet.

1. Click the New worksheet icon at the bottom of the workspace.

Tableau keeps your previous worksheet and creates a new one so that you can continue exploring your
data without losing your work.

2. In the Data pane, double-click State to add it to Detail on the Marks card.

Now you’ve got a map view!


Learn more: Double-click to add geographic fields

Because Tableau already knows that state names are geographic data and because the State dimension
is assigned the State/Province geographic role, Tableau automatically creates a map view.

There is a mark for each of the 48 contiguous states in your data source. (Sadly, Alaska and Hawaii
aren't included in your data source, so they are not mapped.)

Notice that the Country field is also added to the view. This happens because the geographic fields in
Sample - Superstore are part of a hierarchy. Each level in the hierarchy is added as a level of detail.

Additionally, Latitude and Longitude fields are added to the Columns and Rows shelves. You can think
of these as X and Y fields. They're essential any time you want to create a map view, because each
location in your data is assigned a latitudinal and longitudinal value. Sometimes the Latitude and
Longitude fields are generated by Tableau. Other times, you might have to manually include them in
your data. You can find resources to learn more about this in the Learning Library.

Now, having a cool map focused on 48 states is one thing, but you wanted to see what was happening
in the South, remember?

3. Drag Region to the Filters shelf, and then filter down to the South only. The map view zooms in to the
South region, and there is a mark for each state (11 total).

Now you want to see more detailed data for this region, so you start to drag other fields to the Marks
card:

4. Drag the Sales measure to Color on the Marks card.


The view automatically updates to a filled map, and colors each state based on its total sales. Because
you're exploring product sales, you want your sales to appear in USD. Click the Sum(Sales) field on
the Columns shelf, and select Format. For Numbers, select Currency.

Any time you add a continuous measure that contains positive numbers (like Sales) to Color on the
Marks card, your filled map is colored blue. Negative values are assigned orange.

Sometimes you might not want your map to be blue. Maybe you prefer green, or your data isn’t
something that should be represented with the color blue, like wildfires or traffic jams. That would just
be confusing!

No need to worry, you can change the color palette just like you did before.

5. Click Color on the Marks card and select Edit Colors.

For this example, you want to see which states are doing well, and which states are doing poorly in
sales.

6. In the Palette drop-down list, select Red-Green Diverging and click OK. This allows you to see quickly
the low performers and the high performers.

Your view updates to look like this:

But wait. Everything just went red! What happened?

The data is accurate, and technically you can compare low performers with high performers, but is that
really the whole story?

Are sales in some of those states really that terrible, or are there just more people in Florida who want
to buy your products? Maybe you have smaller or fewer stores in the states that appear red. Or maybe
there’s a higher population density in the states that appear green, so there are just more people to buy
your stuff.

Either way, there’s no way you want to show this view to your boss because you aren't confident the
data is telling a useful story.

7. Click the Undo icon in the toolbar to return to that nice, blue view.

There’s still a color problem. Everything looks dandy—that’s the problem!

At first glance, it appears that Florida is performing the best. Hovering over its mark reveals a total of
89,474 USD in sales, as compared to South Carolina, for example, which has only 8,482 USD in sales.
However, have any of the states in the South been profitable?

8. Drag Profit to Color on the Marks card to see if you can answer this question.

Now that’s better! Because profit often consists of both positive and negative values, Tableau
automatically selects the Orange-Blue Diverging color palette to quickly show the states with negative
profit and the states with positive profit.
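The sign-based split that a diverging palette performs can be sketched in a few lines of Python. The state figures below are invented purely for illustration; this is a conceptual sketch, not Tableau's actual color engine:

```python
# Hypothetical state-level profit figures (not from the tutorial's data set)
state_profit = {"Tennessee": -1200, "North Carolina": -800,
                "Florida": -500, "Virginia": 2400}

def diverging_color(value):
    # A diverging palette splits on zero: one hue for losses, another for gains
    return "orange" if value < 0 else "blue"

colors = {state: diverging_color(p) for state, p in state_profit.items()}
```

States with negative profit land on the orange side, profitable states on the blue side, which is exactly why the loss-making states jump out on the map.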

It’s now clear that Tennessee, North Carolina, and Florida have negative profit, even though it appeared they
were doing okay—even great—in Sales. But why? You'll answer that in the next step.

Check your work! Watch "Build a map view" in action


Step 5: Drill down into the details

In the last step you discovered that Tennessee, North Carolina, and Florida have negative profit. To
find out why, you decide to drill down even further and focus on what's happening in those three states
alone.

Pick up where your map view left off


As you saw in the last step, maps are great for visualizing your data broadly. A bar chart will help you
get into the nitty-gritty. To do this, you create another worksheet.

1. Double-click Sheet 3 and name the worksheet Profit Map.

2. Right-click Profit Map at the bottom of the workspace and select Duplicate. Name the new
sheet Negative Profit Bar Chart.

3. In the Negative Profit Bar Chart worksheet, click Show Me, and then select horizontal bars.

Show Me highlights different chart types based on the data you've added to your view.

Note: At any time, you can click Show Me again to collapse it.

You now have a bar chart again—just like that.

4. To select multiple bars on the left, click and drag your cursor across the bars
between Tennessee, North Carolina, and Florida. On the tooltip that appears, select Keep
Only to focus on those three states.

Note: You can also right-click one of the highlighted bars, and select Keep Only.

Notice that an Inclusions field for State is added to the Filters shelf to indicate that certain
states are filtered from the view. The icon with two circles on the field indicates that this field
is a set. You can edit this field by right-clicking the field on the Filters shelf and selecting Edit
Filter.
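Conceptually, the Inclusions set acts as a simple membership filter over the marks in the view. Here is a minimal Python sketch of that idea, with invented profit figures:

```python
# Hypothetical rows standing in for the state-level marks on the chart
rows = [
    {"state": "Tennessee", "profit": -1200},
    {"state": "Texas", "profit": 900},
    {"state": "Florida", "profit": -500},
    {"state": "North Carolina", "profit": -800},
]

# "Keep Only" builds a set of the selected members...
inclusions = {"Tennessee", "North Carolina", "Florida"}

# ...and the view keeps only rows whose state belongs to that set
kept = [r for r in rows if r["state"] in inclusions]
```

Everything outside the set is dropped from the view, but the underlying data is untouched; editing the set on the Filters shelf just changes the membership list.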

Now you want to look at the data for the cities in these states.

5. On the Rows shelf, click the plus icon on the State field to drill down to the City level of
detail.

There’s almost too much information here, so you decide to filter the view down to the cities with the
most negative profit by using a Top N Filter.

Check your work! Watch steps 1-5 in action


Activity 2:
Create a Top N Filter

Solution:
You can use a Top N Filter in Tableau Desktop to limit the number of marks displayed in your view. In this
case, you want to use the Top N Filter to home in on poor performers.

1. From the Data pane, drag City to the Filters shelf.

2. In the Filter dialog box, select the Top tab, and then do the following:

a. Click By field.

b. Click the Top drop-down and select Bottom to reveal the poorest performers.

c. Type 5 in the text box to show the bottom 5 performers in your data set.

Tableau Desktop has already selected a field (Profit) and aggregation (Sum) for the Top N Filter
based on the fields in your view. These settings ensure that your view will display only the five
poorest performing cities by sum of profit.

d. Click OK.
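The "Bottom 5 by sum of Profit" logic boils down to aggregate, then sort, then truncate. The sketch below reproduces it in plain Python; the order rows are invented, though the city names echo the tutorial:

```python
from collections import defaultdict

# Hypothetical (city, profit) order rows
orders = [
    ("Burlington", -900), ("Burlington", -900),
    ("Knoxville", -450), ("Knoxville", -450),
    ("Memphis", -700), ("Jacksonville", -650),
    ("Concord", -600), ("Miami", 300), ("Atlanta", 150),
]

# First aggregate: SUM(Profit) per city, like Tableau's chosen aggregation
profit_by_city = defaultdict(int)
for city, profit in orders:
    profit_by_city[city] += profit

# Then keep the five lowest totals: "Bottom 5 by field, Sum of Profit"
bottom_five = sorted(profit_by_city.items(), key=lambda kv: kv[1])[:5]
```

Note that the filter ranks cities by their *aggregated* profit, not by individual orders, which is why Tableau pre-selects a field and an aggregation for you.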

What happened to the bar chart, and why is it blank? That's a great question, and a great opportunity to
introduce the Tableau Order of Operations.

The Tableau Order of Operations, also known as the query pipeline, is the order in which Tableau performs
various actions, such as the order in which it applies your filters to the view.

Tableau applies filters in the following order:

a. Extract Filters

b. Data Source Filters

c. Context Filters

d. Top N Filters

e. Dimension Filters

f. Measure Filters

The order that you create filters in, or arrange them on the Filters shelf, doesn't change the order in
which Tableau applies those filters to your view.

The good news is you can tell Tableau to change this order when you notice something strange
happening with the filters in your view. In this example, the Top N Filter is applied to the five poorest
performing cities by sum of profit for the whole map, but none of those cities are in the South, so the
chart is blank.
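You can simulate why this ordering matters with plain Python. In this simplified sketch (hypothetical cities and figures, with a Region filter standing in for the dimension filter), running the bottom-N filter before the regional filter empties the result, while reversing the order, which is what adding a filter to context achieves, keeps the Southern cities:

```python
# Hypothetical data: the five least profitable cities company-wide
# all sit outside the South
rows = [
    {"city": "Philadelphia", "region": "East",    "profit": -5000},
    {"city": "Houston",      "region": "Central", "profit": -4000},
    {"city": "San Antonio",  "region": "Central", "profit": -3500},
    {"city": "Chicago",      "region": "Central", "profit": -3000},
    {"city": "Lancaster",    "region": "East",    "profit": -2500},
    {"city": "Burlington",   "region": "South",   "profit": -1800},
    {"city": "Knoxville",    "region": "South",   "profit": -900},
]

def bottom_n(data, n):
    # Stand-in for the Top N filter ("Bottom n by Sum of Profit")
    return sorted(data, key=lambda r: r["profit"])[:n]

def south_only(data):
    # Stand-in for the dimension filter on Region
    return [r for r in data if r["region"] == "South"]

# Default pipeline: Top N (step d) runs before the dimension filter (step e)
default_order = south_only(bottom_n(rows, 5))   # blank chart

# With the filter added to context, the regional cut happens first
in_context = bottom_n(south_only(rows), 5)
```

In the default order, the five worst cities are picked company-wide and none survive the regional filter, so the chart is empty; with the regional cut applied first, the Southern cities come back.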

To fix the chart, add a filter to context. This tells Tableau to filter that field first, regardless of where it
falls on the order of operations.

But which field do you add to context? There are three fields on the Filters shelf: Region (a dimension
filter), City (a top N filter), and Inclusions (Country, State) (a set).

If you look at the order of operations again, you know that the set and the top N filter are being applied
before the dimension filter. But do you know if the top N filter or the set filter is being applied first?
Let's find out.

3. On the Filters shelf, right-click the City field and select Add to Context.

The City field turns gray and moves to the top of the Filters shelf, but nothing changes in the view. So
even though you're forcing Tableau to filter City first, the issue isn't resolved.

4. Click Undo.

5. On the Filters shelf, right-click the Inclusions (Country, State) set and select Add to
Context.

The Inclusions (Country, State) set turns gray and moves to the top of the Filters shelf.
And bars have returned to your view!

You're on to something! But there are six cities in the view, including Jacksonville, North Carolina,
which has a positive profit. Why would a city with a positive profit show up in the view when you
created a filter that was supposed to filter out profitable cities?

Jacksonville, North Carolina is included because City is the lowest level of detail shown in the view.
For Tableau Desktop to know the difference between Jacksonville, North Carolina, and Jacksonville,
Florida, you need to drill down to the next level of detail in the location hierarchy, which, in this case,
is Postal Code. After you add Postal Code, you can exclude Jacksonville in North Carolina without also
excluding Jacksonville in Florida.
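The ambiguity is easy to reproduce: grouped by city name alone, the two Jacksonvilles collapse into one mark. The rows below are invented (28540 appears in the tutorial; the Florida postal code is a hypothetical stand-in):

```python
from collections import defaultdict

# Two different Jacksonvilles, distinguishable only by postal code
orders = [
    {"city": "Jacksonville", "postal": "28540", "profit": 400},    # North Carolina
    {"city": "Jacksonville", "postal": "32216", "profit": -1200},  # Florida (hypothetical code)
]

# Grouped by city alone, both Jacksonvilles merge into a single mark
by_city = defaultdict(int)
for o in orders:
    by_city[o["city"]] += o["profit"]

# Drilling down to Postal Code keeps the two locations separate
by_city_postal = defaultdict(int)
for o in orders:
    by_city_postal[(o["city"], o["postal"])] += o["profit"]

# Now 28540 can be excluded without touching Jacksonville, Florida
after_exclude = {k: v for k, v in by_city_postal.items() if k[1] != "28540"}
```

At the city level the filter cannot tell the two apart; at the postal-code level the exclusion hits only the North Carolina location.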

6. On the Rows shelf, click the plus icon on City to drill down to the Postal Code level of detail.

7. Right-click the postal code for Jacksonville, North Carolina, 28540, and then select Exclude.

Postal Code is added to the Filters shelf to indicate that certain members in the Postal Code field have
been filtered from the view. Even when you remove the Postal Code field from the view, the filter
remains.

8. Drag Postal Code off the Rows shelf.

Your view updates to look like this:

Check your work! Watch steps 1-8 in action


Now that you've focused your view to the least profitable cities, you can investigate further to
identify the products responsible.

Identify the trouble makers:


You decide to break up the view by Sub-Category to identify the products dragging profit down. You
know that the Sub-Category field contains information about products sold by location, so you start
there.

1. Drag Sub-Category to the Rows shelf, and place it to the right of City.

2. Drag Profit to Color on the Marks card to make it easier to see which products have negative
profit.

3. In the Data pane, right-click Order Date and select Show Filter.

You can now explore negative profits for each year if you want, and quickly spot the products
that are losing money.

Machines, tables, and binders don’t seem to be doing well. So what if you stop selling those
items in Jacksonville, Concord, Burlington, Knoxville, and Memphis?

Verify your findings
Will eliminating binders, machines, and tables improve profits in Florida, North Carolina, and
Tennessee? To find out, you can filter out the problem products to see what happens.

1. Go back to your map view by clicking the Profit Map sheet tab.

2. In the Data pane, right-click Sub-Category and select Show Filter.

A filter card for all of the products you offer appears next to the map view. You'll use this filter
later.

3. From the Data pane, drag Profit and Profit Ratio to Label on the Marks card. To format the
Profit Ratio as a percentage, right-click Profit Ratio, and select Format. Then, for Default
Numbers, choose Percentage and set the number of decimal places you want displayed on the
map. For this map, we'll choose zero decimal places.

Now you can see the exact profit of each state without having to hover your cursor over them.
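Under the hood, the Profit Ratio label is just profit divided by sales, rendered as a percentage with zero decimal places. A quick sketch with hypothetical figures:

```python
# Hypothetical totals for one state
profit, sales = 4_200.0, 50_000.0

profit_ratio = profit / sales        # Profit Ratio = SUM(Profit) / SUM(Sales)
label = f"{profit_ratio:.0%}"        # percentage, zero decimal places
```

With these numbers the ratio is 0.084, so the label reads "8%"; raising the precision to `.1%` would show "8.4%" instead.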

4. In the Data pane, right-click Order Date and select Show Filter to provide some context for
the view.

A filter card for YEAR(Order Date) appears in the view. You can now view profit for all years
or for a combination of years. This might be useful for your presentation.

5. Clear Binders, Machines, and Tables from the list on the Sub-Category filter card in the view.

Recall that adding filters to your view lets you include and exclude values to highlight certain
parts of your data.

As you clear each member, the profit for Tennessee, North Carolina, and Florida improves, until
finally, each has a positive profit.


Hey, you made an interesting discovery!

Binders, machines, and tables are definitely responsible for the losses in Tennessee, North
Carolina, and Florida, but not for the rest of the South. Do you notice how profit actually
decreases for some of the other states as you clear items from the filter card? For example, if
you toggle Binders on the Sub-Category filter card, profit drops by four percent in Arkansas.
You can deduce that Binders are actually profitable in Arkansas.

You want to share this discovery with the team by walking them through the same steps you
took.

6. Select (All) on the Sub-Category filter card to include all products again.

Learn more: More questions you could ask

Check your work! Watch "Verify your findings" in action


Now you know that machines, tables, and binders are problematic products for your company. In
focusing on the South, you see that these products have varying impacts on profit. This might be a
worthwhile conversation to have with your boss.

Next, you'll assemble the work you've done so far in a dashboard so that you can clearly present your
findings.

Activity 3:
Build a dashboard to show your insights:

Solution:
You’ve created four worksheets, and they're communicating important information that your boss needs to
know. Now you need a way to show the negative profits in Tennessee, North Carolina, and Florida and explain
some of the reasons why profits are low.

To do this, you can use dashboards to display multiple worksheets at once, and—if you want—make them
interact with one another.

Set up your dashboard

You want to emphasize that certain items sold in certain places are doing poorly. Your bar graph view
of profit and your map view demonstrate this point nicely.

1. Click the New dashboard button.

2. In the Dashboard pane on the left, you'll see the sheets that you created. Drag Sales in the
South to your empty dashboard.

3. Drag Profit Map to your dashboard, and drop it on top of the Sales in the South view.

Your view will update to look like this:

Now you can see both views at once!

But sadly, the bar chart is a bit squished, which isn’t helping your boss understand your data.

Arrange your dashboard


It's not easy to see details for each item under Sub-Category from your Sales in the South bar chart.
Also, because we have the map in view, we probably don't need the South region column in Sales in
the South, either.

Resolving these issues will give you more room to communicate the information you need.

1. On Sales in the South, right-click in the column area under the Region column header, and
clear Show header.

2. Repeat this process for the Category row header.

You've now hidden unnecessary columns and rows from your dashboard while preserving the
breakdown of your data. The extra space makes it easier to see data on your dashboard, but let's
freshen things up even more.

3. Right-click the Profit Map title and select Hide Title.

The title Profit Map is hidden from the dashboard and even more space is created.

4. Repeat this step for the Sales in the South view title.

5. Select the first Sub-Category filter card on the right side of your view, and at the top of the
card, click the Remove icon.

6. Repeat this step for the second Sub-Category filter card and one of the Year of Order
Date filter cards.

7. Click on the Profit color legend and drag it from the right to below Sales in the South.

8. Finally, select the remaining Year of Order Date filter, click its drop-down arrow, and then
select Floating. Move it to the white space in the map view. In this example, it is placed just
off the East Coast, in the Atlantic Ocean.

Try selecting different years on the Year of Order Date filter. Your data is quickly filtered to
show that state performance varies year by year. That's nice, but it could be made even easier
to compare.

9. Click the drop-down arrow at the top of the Year of Order Date filter, and select Single Value (Slider).

Your view updates to look like this:

Learn more: Differentiate Floating vs. Tiled objects on a dashboard

Your dashboard is looking really good! Now you can easily compare profit and sales by year.
But that's not so different from a couple of pictures in a presentation—and you're using Tableau! Let's
make your dashboard more engaging.

Check your work! Watch "Arrange your dashboard" in action


Add interactivity
Wouldn't it be great if you could view which sub-categories are profitable in specific states?

1. Select Profit Map in the dashboard, and click the Use as filter icon in the upper right
corner.

2. Select a state within the Southern region of the map.

The Sales in the South bar chart automatically updates to show just the sub-category sales in
the selected state. You can quickly see which sub-categories are profitable.

3. Click an area of the map other than the colored Southern states to clear your selection.

You also want viewers to be able to see the change in profits based on the order date.

4. Select the Year of Order Date filter, click its drop-down arrow, and select Apply to
Worksheets > Selected Worksheets.

5. In the Apply Filter to Worksheets dialog box, select All in dashboard, and then click OK.

This option tells Tableau to apply the filter to all worksheets in the dashboard that use this
same data source.

Explore state performance by year with your new, interactive dashboard!

Check your work! Watch "Add interactivity" in action

Here, we filter Sales in the South to only items sold in North Carolina, and then explore year by year
profit.


Rename and go
You show your boss your dashboard, and she loves it. She's named it "Regional Sales and Profit," and
you do the same by double-clicking the Dashboard 1 tab and typing Regional Sales and Profit.

In her investigations, your boss also finds that the decision to introduce machines in the North Carolina
market in 2021 was a bad idea.

Your boss is glad she has this dashboard to explore, but she also wants you to present a clear action
plan to the larger team. She asks you to create a presentation with your findings.

Good thing you know about stories in Tableau.

Step 7: Build a story to present:


You want to share your findings with the larger team. Together, your team might reevaluate selling
machines in North Carolina.

Instead of having to guess which key insights your team is interested in and including them in a
presentation, you decide to create a story in Tableau. This way, you can walk viewers through your
data discovery process, and you have the option to interactively explore your data to answer any
questions that come up during your presentation.

Create your first story point


For the presentation, you'll start with an overview.

1. Click the New story button.

You're presented with a blank workspace that reads, "Drag a sheet here." This is where you'll
create your first story point.

Blank stories look a lot like blank dashboards. And like a dashboard, you can drag worksheets
over to present them. You can also drag dashboards over to present them in your story.

2. From the Story pane on the left, drag the Sales in the South worksheet onto your view.

3. Add a caption—maybe "Sales and profit by year"—by editing the text in the gray box above
the worksheet.

This story point is a useful way to acquaint viewers with your data.

But you want to tell a story about selling machines in North Carolina, so let's focus on that data.

Highlight machine sales


To bring machines into the picture, you can leverage the Sub-Category filter included in your Sales in
the South bar chart.

1. In the Story pane, click Duplicate to duplicate the first caption.

Continue working where you left off, but know that your first story point will be exactly as you
left it.

2. Since you know you’re telling a story about machines, on the Sub-Category filter, clear the
selection for (All), then select Machines.

Now your viewers can quickly identify the sales and profit of machines by year.

3. Add a caption to underscore what your viewers see, for example, "Machine sales and profit by
year."

You've successfully shifted the focus to machines, but you realize that something seems odd:
in this view, you can't single out which state is contributing to the loss.

You'll address this in your next story point by introducing your map.

Check your work! Watch "Create your first story point" and "Highlight machine sales" in action.


Make your point
The bottom line is that machines in North Carolina lose money for your company. You discovered that
in the dashboard you created. Looking at overall sales and profit by year doesn't demonstrate this point
alone, but regional profit can.

1. In the Story pane, select Blank. Then, drag your dashboard Regional Sales and Profit onto
the canvas.

This gives viewers a new perspective on your data: Negative profit catches the eye.

2. Add a caption like, "Underperforming items in the South."

To narrow your results to just North Carolina, start with a duplicate story point.

1. Select Duplicate to create another story point with your Regional Sales and Profit dashboard.

2. Select North Carolina on the map and notice that the bar chart automatically updates.

3. Select All on the Year of Order Date filter card.

4. Add a caption, for example, "Profit in North Carolina, 2018-2021."

Now you can walk viewers through profit changes by year in North Carolina. To do this, you will
create four story points:

1. Select Duplicate to begin with your Regional Sales and Profit dashboard focused on North Carolina.

2. On the Year of Order Date filter, click the right arrow button so that 2018 appears.

3. Add a caption, for example, "Profit in North Carolina, 2018," and then click Duplicate.

4. Repeat steps 2 and 3 for years 2019, 2020, and 2021.

Now viewers will have an idea of which products were introduced to the North Carolina market when,
and how poorly they performed.

Check your work! Watch "Make your point" in action.


Finishing touches
On this story point that focuses on data from 2021, you want to describe your findings. Let's add more
detail than just a caption.

1. In the left pane, select Drag to add text and drag it onto your view.

2. Enter a description for your dashboard that emphasizes the poor performance of machines in
North Carolina, for example, "Introducing machines to the North Carolina market in 2021
resulted in losing a significant amount of money."

For dramatic effect, you can hover over Machines on the Sales in the South bar chart while
presenting to show a useful tooltip: the loss of nearly $4,000.

And now, for the final slide, you drill down into the details.

3. In the Story pane, click Blank.

4. From the Story pane, drag Negative Profit Bar Chart to the view.

5. In the Year of Order Date filter card, narrow the view down to 2021 only.

You can now easily see that the loss of machine profits was solely from Burlington, North
Carolina.

6. In the view, right-click the Burlington mark (the bar) and select Annotate > Mark.

7. In the Edit Annotation dialog box that appears, delete the filler text and type: "Machines in Burlington
lost nearly $4,000 in 2021."

8. Click OK.

9. In the view, click the annotation and drag it to adjust where it appears.
10. Give this story point the caption: "Where are we losing machine profits in North Carolina?"

11. Double-click the Story 1 tab and rename your story to "Improve Profits in the South".

12. Review your story by selecting Window > Presentation mode.

Check your work! Watch "Finishing touches" in action.


Step 8: Share your findings:


You've done a bunch of work—great work—to learn that Burlington, North Carolina needs some fine
tuning. Let's share this information with your teammates.

Before you continue, select an option below:

• If you or your company does not use Tableau Server, or if you want to learn about a free,
alternative sharing option, jump to Use Tableau Public.

• If you or your company uses Tableau Server, and you are familiar with what permissions are
assigned to you, jump to Use Tableau Server.

Use Tableau Public


Your story was a hit. You're going to publish it to Tableau Public so that your team can view it online.

Note: When you publish to Tableau Public, as the name suggests, these views are publicly accessible.
This means that you share your views as well as your underlying data with anyone with access to the
internet. When sharing confidential information, consider Tableau Server(Link opens in a new
window) or Tableau Cloud(Link opens in a new window).

1. Select Server > Tableau Public > Save to Tableau Public.

2. Enter your Tableau Public credentials in the dialog box.

If you don't have a Tableau Public profile, click Create one now for free and follow the
prompts.

3. If you see this dialog box, open the Data Source page. Then in the top-right corner, change
the Connection type from Live to Extract.

4. For the second (and last) time, select Server > Tableau Public > Save to Tableau Public.

5. When your browser opens, review your embedded story. It will look like this:

6. Click Edit Details to update the title of your viz, add a description, and more.

7. Click Save.

Your story is now live on the web.

8. To share with colleagues, click Share at the bottom of your viz.

9. How do you want to share your story?

a. Embed on your website: Copy the Embed Code and paste it in your web page HTML.

b. Send a link: Copy the Link and send the link to your colleagues.

c. Send an email using your default email client by clicking the email icon.

d. Share on Twitter or Facebook by clicking the appropriate icon.

Use Tableau Server


Your story was a hit. You're going to publish it to Tableau Server so that your team can view it online.

Publish to Tableau Server

1. Select Server > Publish Workbook or click Share on the toolbar.

2. Enter the name of the server (or IP address) that you want to connect to in the dialog box and
click Connect.

3. In the Name field, enter Improve Profits in the South.

4. If you want, enter a description for reference, for example "Take a look at the story I built in
Tableau Desktop!"

5. Under Sheets, click Edit, and then clear all sheets except Improve Profits in the South.

Learn more: Share more than just your story.

6. Click Publish.

Tableau Server opens in your internet browser. If prompted, enter your server credentials.

The Publishing Complete dialog box lets you know that your story is ready to view.

Great work! You've successfully published your story using Tableau Server.

366
Send a link to your work

Let's share your work with your teammates so that they can interact with your story online.

1. In Tableau Server, navigate to the Improve Profits in the South story that you published. You
will see a screen like this:

If you had published additional sheets from your workbook, they would be listed alongside
Improve Profits in the South.

2. Click Improve Profits in the South.

Your screen will update to look like this:

Awesome! This is your interactive, embedded story.

3. From the menu, select Share.

4. How do you want to share your story?

a. Embed on your website by copying the Embed Code and pasting it in your web page
HTML.

b. Send a link by copying the Link and sending the link to your colleagues.

c. Send an email by using your default email client: click the email icon.

3) Graded Lab Tasks


Note: The instructor can design graded lab activities according to the level of difficulty and
complexity of the solved lab activities. The lab tasks assigned by the instructor should be
evaluated in the same lab.
Lab Task 1:
• Gather data
• Structure the data
• Explore the data
• Share insights
