Bus Matrix: the foundation of your Data Warehouse
Bill Anton
Prime Data Intelligence
What we will cover today:
Dimensional Modeling 101
What, Why, How
Common Challenges
Bus Matrix
What is it?
How does it help?
Examples
Kimball Dimensional DW
What is Dimensional Modeling?
Dimensional modeling
Logical design technique for structuring data
It is intuitive to business users
Easy-to-understand
Fast query performance
Primary constructs of a dimensional model
fact tables
dimension tables
What is Dimensional Modeling?
Facts
additive amounts
E.g. Sales amount, inventory quantity
SUM, AVERAGE, MAX, MIN, COUNT
Dimensions
descriptive attributes
E.g. Date, Product, Location, Customer
GROUP BY <attribute>, <attribute>, etc.
Fact Table
Measurements associated with a specific business process
Grain: level of detail of the table
Process events produce fact records
Facts (attributes) are usually
Numeric
Additive
Derived facts included
Foreign (surrogate) keys refer to dimension tables (entities)
Classification values help define subsets
Grain (unit of analysis)
The grain determines what each fact record represents:
the level of detail.
For example
Individual transactions
Snapshots (points in time)
Line items on a document
Generally better to focus on the smallest grain
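Why the smallest grain is usually the safer choice can be shown with a minimal sketch. The rows and the roll_up helper below are hypothetical illustrations, not part of the original deck. Facts stored per line item can be rolled up to any coarser level, while facts stored only per order can no longer be split by product.

```python
from collections import defaultdict

# Hypothetical fact rows at the finest grain: one row per order line item.
line_items = [
    {"order_num": 1, "sku": "A", "dollars_sold": 10.0},
    {"order_num": 1, "sku": "B", "dollars_sold": 5.0},
    {"order_num": 2, "sku": "A", "dollars_sold": 10.0},
]

def roll_up(rows, key, fact):
    """Sum an additive fact to a coarser grain identified by `key`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[fact]
    return dict(totals)

# From line-item grain, both roll-ups are possible.
order_totals = roll_up(line_items, "order_num", "dollars_sold")
sku_totals = roll_up(line_items, "sku", "dollars_sold")

print(order_totals)
print(sku_totals)
```

If only order_totals had been stored, sku_totals could not be derived from it, because the product detail is gone.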
Dimension Tables
Entities describing the objects of the process
Conformed dimensions cross processes
Attributes are descriptive
Text
Numeric
Surrogate keys
Less volatile than facts (1:m with the fact table)
Null entries
Date dimensions
Produce "by" questions (e.g. sales by month)
E.g. Date Dimensions
Fiscal Year
Calendar Year
Fiscal Quarter
Calendar Quarter
Fiscal Month
Calendar Month
Fiscal Week
Calendar Week
Type of Day
Day of Week
Day
Holiday
Attribute Name | Attribute Description | Sample Values
Day | The specific day that an activity took place. | 06/04/1998; 06/05/1998
Day of Week | The specific name of the day. | Monday; Tuesday
Holiday | Identifies that this day is a holiday. | Easter; Thanksgiving
Type of Day | Indicates whether or not this day is a weekday or a weekend day. | Weekend; Weekday
Calendar Week | The week ending date, always a Saturday. Note that WE denotes ... | WE 06/06/1998; WE 06/13/1998
Calendar Month | The calendar month. | January, 1998; February, 1998
Calendar Quarter | The calendar quarter. | 1998Q1; 1998Q4
Calendar Year | The calendar year. | 1998
Fiscal Week | The week that represents the corporate calendar. Note that the F ... | F Week 1 1998; F Week 46 1998
Fiscal Month | The fiscal period comprised of 4 or 5 weeks. Note that the F in the data ... | F January, 1998; F February, 1998
Fiscal Quarter | The grouping of 3 fiscal months. | F 1998Q1; F 1998Q2
Fiscal Year | The grouping of 52 fiscal weeks / 12 fiscal months that comprise the financial year. | F 1998; F 1999
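The calendar attributes in this table can be derived from the date itself. The sketch below is an illustration under assumptions: the function name and dictionary keys are hypothetical, and the fiscal and holiday attributes are left out because they depend on a corporate calendar the slides do not specify.

```python
from datetime import date, timedelta

def date_dimension_row(day):
    """Build the calendar attributes of one date dimension row."""
    # weekday(): Monday=0 ... Sunday=6, so Saturday is 5.
    days_until_saturday = (5 - day.weekday()) % 7
    week_ending = day + timedelta(days=days_until_saturday)
    quarter = (day.month - 1) // 3 + 1
    return {
        "day": day.strftime("%m/%d/%Y"),
        "day_of_week": day.strftime("%A"),
        "type_of_day": "Weekend" if day.weekday() >= 5 else "Weekday",
        "calendar_week": "WE " + week_ending.strftime("%m/%d/%Y"),
        "calendar_month": day.strftime("%B, %Y"),
        "calendar_quarter": f"{day.year}Q{quarter}",
        "calendar_year": day.year,
    }

row = date_dimension_row(date(1998, 6, 4))
for name, value in row.items():
    print(name, "=", value)
```

A real date dimension would be generated once for a range of years and loaded as a table, with the fiscal columns filled from the organization's own calendar.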
What is Dimensional Modeling?
Star Schema
CUSTOMER
customer_ID (PK)
customer_name
purchase_profile
credit_profile
address
STORE
store_ID (PK)
store_name
address
district
floor_type
CLERK
clerk_id (PK)
clerk_name
clerk_grade
ERD
ORDER
order_num (PK)
customer_ID (FK)
store_ID (FK)
clerk_ID (FK)
date
PRODUCT
SKU (PK)
description
brand
category
ORDER-LINE
order_num (PK) (FK)
SKU (PK) (FK)
promotion_key (FK)
dollars_sold
units_sold
dollars_cost
PROMOTION
promotion_NUM (PK)
promotion_name
price_type
ad_type
TIME
time_key (PK)
SQL_date
day_of_week
month
STORE
store_key (PK)
store_ID
store_name
address
district
floor_type
CLERK
clerk_key (PK)
clerk_id
clerk_name
clerk_grade
DIMENSIONAL MODEL
FACT
time_key (FK)
store_key (FK)
clerk_key (FK)
product_key (FK)
customer_key (FK)
promotion_key (FK)
dollars_sold
units_sold
dollars_cost
PRODUCT
product_key (PK)
SKU
description
brand
category
CUSTOMER
customer_key (PK)
customer_name
purchase_profile
credit_profile
address
PROMOTION
promotion_key (PK)
promotion_name
price_type
ad_type
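A cut-down version of the dimensional model above can be tried with Python's built-in sqlite3 module. This is a sketch under assumptions: only two of the six dimensions are created, the fact table is named sales_fact for illustration, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_key INTEGER PRIMARY KEY, sku TEXT, brand TEXT, category TEXT);
CREATE TABLE store (store_key INTEGER PRIMARY KEY, store_name TEXT, district TEXT);
-- The fact table holds only surrogate keys and additive facts.
CREATE TABLE sales_fact (
    product_key INTEGER REFERENCES product(product_key),
    store_key INTEGER REFERENCES store(store_key),
    dollars_sold REAL,
    units_sold INTEGER
);
INSERT INTO product VALUES (1, 'SKU-1', 'Acme', 'Bikes'), (2, 'SKU-2', 'Acme', 'Helmets');
INSERT INTO store VALUES (1, 'Downtown', 'North'), (2, 'Airport', 'South');
INSERT INTO sales_fact VALUES (1, 1, 500.0, 1), (1, 2, 300.0, 1), (2, 1, 40.0, 2);
""")

# Facts are aggregated; dimension attributes go in the GROUP BY.
rows = conn.execute("""
    SELECT p.category, SUM(f.dollars_sold), SUM(f.units_sold)
    FROM sales_fact AS f
    JOIN product AS p ON p.product_key = f.product_key
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()

print(rows)
```

Every analytical query follows the same shape: join the fact table to the dimensions it needs, aggregate the facts, group by the attributes.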
What is Dimensional Modeling?
Denormalization
Repeating Values
Opposite of normalized (e.g. 3rd Normal Form)
Optimized for reads (not writes)
Why Dimensional Modeling
Intuitive to Business Users
Simpler than OLTP/3NF
Rise of Self-Service (E.g. Power Pivot, Power View)
Iterative Development
Agile
Performance
Optimized for analytical queries
e.g. sales amount by product in 2013 for top 10 all-time customers
And many more
See Teo Lachev's "Why Semantic Layer" newsletter:
http://www.prologika.com/Newsroom/Newsletter2013Fall.aspx
Intuitive to Business Users
How many bikes did we sell last year?
Do we sell more bikes to single or married females?
What was our most/least profitable product this year?
What was the Average Monthly Gross Margin Return on Inventory Investment (GMROII) by Product Category for the trailing 6 months?
It's Complicated
Star-Schema
1 Star per Fact table
Sales Process
Inventory Process
Facts are related through dimensions
Conformed Dimensions
A conformed dimension is a set of data attributes that is physically referenced by multiple fact tables, using the same key values to refer to the same structure, attributes, domain values, definitions and concepts.
Dimensions are conformed when they are either exactly the same (including keys) or one is a perfect subset of the other.
Dimension tables are NOT conformed if the attributes are labeled differently or contain different values.
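The conformance rule can be turned into a simple check. The sketch below is a simplified illustration, not a standard tool: each dimension is assumed to be a dict mapping a surrogate key to its attribute values, and is_conformed is a hypothetical helper. It covers only the row-subset case, not a dimension that keeps a subset of the columns.

```python
def is_conformed(dim_a, dim_b):
    """True if the dimensions are identical or one is a perfect row subset.

    Each dimension maps surrogate key -> dict of attribute name -> value.
    """
    smaller, larger = sorted((dim_a, dim_b), key=len)
    # Every row of the smaller dimension must exist in the larger one
    # with the same key, the same attribute labels and the same values.
    return all(
        key in larger and larger[key] == attrs
        for key, attrs in smaller.items()
    )

product_full = {1: {"category": "Bikes"}, 2: {"category": "Helmets"}}
product_subset = {1: {"category": "Bikes"}}
product_relabeled = {1: {"product_category": "Bikes"}}
product_other_values = {1: {"category": "Bicycles"}}

print(is_conformed(product_full, product_subset))
print(is_conformed(product_full, product_relabeled))
print(is_conformed(product_full, product_other_values))
```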
Revisiting Average Monthly Gross Margin
Return on Inventory Investment (GMROII)
Average Monthly GMROII = (Profit for total time period) / (Sum of each month's ending inventory cost)
What was the Average Monthly Gross Margin Return on Inventory Investment (GMROII) by Product Category for the trailing 6 months?
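As a worked example, the calculation can be sketched in a few lines. It reads the slide's diagram as profit for the total period divided by the sum of month-ending inventory cost, which is an interpretation of the figure. The function name and all numbers are hypothetical. Profit would come from the sales facts and inventory cost from the inventory facts, joined through the conformed Product and Date dimensions.

```python
def average_monthly_gmroii(monthly_profit, monthly_ending_inventory_cost):
    """Profit for the whole period over the summed month-ending inventory cost."""
    total_profit = sum(monthly_profit)
    total_inventory_cost = sum(monthly_ending_inventory_cost)
    if total_inventory_cost == 0:
        raise ValueError("inventory cost sums to zero; GMROII is undefined")
    return total_profit / total_inventory_cost

# Hypothetical trailing-6-month figures per product category.
profit_by_category = {
    "Bikes": [100, 120, 80, 100, 90, 110],
    "Helmets": [10, 10, 10, 10, 10, 10],
}
inventory_cost_by_category = {
    "Bikes": [500, 500, 400, 600, 500, 500],
    "Helmets": [20, 20, 20, 20, 20, 20],
}

gmroii_by_category = {
    category: average_monthly_gmroii(
        profit_by_category[category], inventory_cost_by_category[category]
    )
    for category in profit_by_category
}
print(gmroii_by_category)
```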
Where things start to get complex
1 Star per Fact table
Multiple Fact tables per business process
Multiple business processes in an enterprise
Dimensional Model becomes a Galaxy of Stars
Finance
Production
Sales
Distribution
HR
ER Diagram: Adventure Works Sample DW
For bigger Data Warehouses
This ^^
Turns into this ^^
Variety of Problems to Overcome
with Dimensional Modeling
Communication & Strategy
What's the short-term plan of attack?
What's the long-term plan of attack?
Documentation
What's in our Data Warehouse?
Business Users can't read ER diagrams
Business Users are typically familiar with only 1 or 2 business processes
E.g. Sales User vs Inventory User; Warehouse Supervisor vs CEO
Conforming Dimensions is hard... REALLY hard
So are changes (E.g. Impact Analysis)
What's the Solution?
Train business users to read ER Diagrams?
Simplify Data Model?
Ignore certain business processes?
Don't use Conformed Dimensions?
Force business users to manually map data between processes?
What about a Bus Matrix?
What is a Bus Matrix?
2-dimensional visualization showing the intersection of facts and dimensions
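A bus matrix is small enough to hold in a plain data structure. The sketch below is an illustration with hypothetical process and dimension names: each business process (fact) maps to the dimensions it uses, and the grid marks every intersection with an X.

```python
from collections import Counter

# Hypothetical bus matrix: business process -> dimensions it references.
bus_matrix = {
    "Sales": {"Date", "Product", "Store", "Customer", "Promotion"},
    "Inventory": {"Date", "Product", "Store"},
}

def render_bus_matrix(matrix):
    """Return a text grid with processes as rows and dimensions as columns."""
    dimensions = sorted(set().union(*matrix.values()))
    width = max(len(name) for name in matrix)
    lines = [" " * width + " | " + " | ".join(dimensions)]
    for process in sorted(matrix):
        cells = [
            ("X" if dim in matrix[process] else "").center(len(dim))
            for dim in dimensions
        ]
        lines.append(process.ljust(width) + " | " + " | ".join(cells))
    return "\n".join(lines)

def shared_dimensions(matrix):
    """Dimensions used by more than one process: candidates for conforming."""
    counts = Counter(dim for dims in matrix.values() for dim in dims)
    return {dim for dim, n in counts.items() if n > 1}

grid = render_bus_matrix(bus_matrix)
print(grid)
print(sorted(shared_dimensions(bus_matrix)))
```

Columns marked for several processes are the dimensions that have to be conformed.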
Variety of Use-Cases for a Bus Matrix
Documentation, Communication, Training
Facilitate User Adoption of BI tools
Communicate Expectations w/ Business
New users unfamiliar with new business process
Team Development
Agile
Prioritization of Tasks
Divide & Conquer
Road-Mapping
Prioritization of Business Processes in a Business Intelligence Program
Documentation For Business
Documentation for IT
Master Bus Matrix
Team Development
Sprint 1
Internet Sales
Sprint 2
Reseller Sales
Road-Mapping
When To Create a Bus Matrix
During Requirements Gathering
Before You Start Development!
Updated Over Time
Changes to Business Processes
New Source Systems (E.g. mergers/acquisitions)
How To Create a Bus Matrix
Manual via Excel
Automated via SSRS
Manual
Only option when starting out ;-)
Updates can be made quickly as requirements come in
Adds development overhead, but the ROI is well worth it
Automated
Reporting pack with drillthrough to data dictionary information
Can be based on Cube or Relational Database (*FK required)
Incorporate query statistics to visualize common usage patterns
Use MDS to allow SMEs to manage business definitions
Based on example from Alex Whittles
http://www.purplefrogsystems.com/blog/2010/09/olap-cube-documentation-in-ssrs-part-1/
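The relational option needs foreign keys because they are what tie each fact table to its dimensions. The sketch below illustrates that idea only. It is not the SSRS approach from the linked example: it swaps in SQLite and its PRAGMA foreign_key_list through Python's sqlite3 module. The table names and the bus_matrix_from_foreign_keys helper are hypothetical.

```python
import sqlite3

def bus_matrix_from_foreign_keys(conn, fact_tables):
    """Map each fact table to the tables its foreign keys reference."""
    matrix = {}
    for fact in fact_tables:
        # Each result row describes one FK column; index 2 is the referenced table.
        rows = conn.execute(f'PRAGMA foreign_key_list("{fact}")').fetchall()
        matrix[fact] = sorted({row[2] for row in rows})
    return matrix

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE date_dim (date_key INTEGER PRIMARY KEY);
CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY);
CREATE TABLE customer_dim (customer_key INTEGER PRIMARY KEY);
CREATE TABLE sales_fact (
    date_key INTEGER REFERENCES date_dim(date_key),
    product_key INTEGER REFERENCES product_dim(product_key),
    customer_key INTEGER REFERENCES customer_dim(customer_key),
    dollars_sold REAL
);
CREATE TABLE inventory_fact (
    date_key INTEGER REFERENCES date_dim(date_key),
    product_key INTEGER REFERENCES product_dim(product_key),
    quantity_on_hand INTEGER
);
""")

fk_matrix = bus_matrix_from_foreign_keys(conn, ["sales_fact", "inventory_fact"])
print(fk_matrix)
```

Without declared foreign keys this lookup returns nothing, which is why the relational option carries the FK requirement.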
QUESTIONS
References
Twitter: @SQLbyoBI
Blog: http://byoBI.com
http://byobi.com/blog/bus-matrix/