CHAPTER 9: DATA CONNECTION OPTIONS
Data Connection Options
■ Using Data Sources & Connections
■ Comparing Options (Live/Extract, Relationships/Joins/Blends)
■ Joins & Cross-Database Joins
■ Data Blending (Concept & Use)
■ Unions
■ Filter Across Multiple Data Sources
Using Data Sources & Connections
■ Connect to files (Excel, CSV, JSON) and databases (SQL Server, Oracle, PostgreSQL)
■ Save curated sources as .tds (connection details and metadata only) or .tdsx (also packages local file data or extracts) for reuse & governance
■ Prefer relationships at the logical layer for multi-table models
■ Use joins inside a logical table (physical layer) only when needed
Comparing Connection Options
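■ Live vs Extract: a live connection queries the source each time (current data, speed depends on the source); an extract is a snapshot that is usually faster but only as fresh as its last refresh
■ Relationships (logical layer, resolved per view) vs Joins (physical layer, fixed row-level merge)
■ Cross-Database Join (one data source, multiple connections, row-level) vs Data Blending (separate data sources, worksheet-level, post-aggregation)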
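The live-versus-extract distinction can be sketched in a few lines of plain Python. This is only an analogy, not Tableau's implementation: `source_rows`, `query_live`, and the `Extract` class are hypothetical names invented for the illustration.

```python
import copy

# Hypothetical source table; in practice this would live in a database or file.
source_rows = [{"order_id": "A1", "sales": 100.0}]


def query_live(rows):
    """A live connection reads the source every time the view is queried."""
    return sum(r["sales"] for r in rows)


class Extract:
    """A snapshot of the source that stays unchanged until refresh() is called."""

    def __init__(self, rows):
        self._source = rows
        self.refresh()

    def refresh(self):
        # Copy the current state of the source into the snapshot.
        self._snapshot = copy.deepcopy(self._source)

    def query(self):
        return sum(r["sales"] for r in self._snapshot)


extract = Extract(source_rows)
source_rows.append({"order_id": "B2", "sales": 50.0})  # new data arrives at the source

live_total = query_live(source_rows)  # includes the new row
stale_total = extract.query()         # still the old snapshot
extract.refresh()
refreshed_total = extract.query()     # catches up after the refresh

print(live_total, stale_total, refreshed_total)
```

The extract answers from its own copy, which is why it can be fast and also why it can be out of date.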
Data Model: Logical vs Physical
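■ Logical layer: relate tables (noodles); Tableau chooses the join type at query time from the fields used in the view
■ Physical layer: inside each logical table, define joins explicitly
■ Benefits of relationships: each table keeps its own grain, so measures are not inflated by duplicated rows, and queries touch only the tables a view needs, which often helps performance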
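Why grain matters can be shown with a small plain-Python sketch. It assumes two hypothetical tables, `orders` (one row per order line) and `shipping` (one row per order), and a hand-written `left_join` helper; it imitates the effect of a physical join versus aggregating each table at its own grain, not how Tableau actually generates queries.

```python
from collections import defaultdict

# Hypothetical sample data: order A1 has two line items.
orders = [
    {"order_id": "A1", "sub_category": "Chairs", "sales": 100.0},
    {"order_id": "A1", "sub_category": "Tables", "sales": 250.0},
    {"order_id": "B2", "sub_category": "Chairs", "sales": 80.0},
]
shipping = [  # one row per order, a coarser grain than order lines
    {"order_id": "A1", "shipping_cost": 20.0},
    {"order_id": "B2", "shipping_cost": 5.0},
]


def left_join(left, right, key):
    """Row-level left join: each left row is paired with every matching right row."""
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    joined = []
    for row in left:
        for match in index.get(row[key]) or [{}]:
            joined.append({**match, **row})
    return joined


# Physical join first, aggregate second: A1's shipping cost is counted twice.
joined = left_join(orders, shipping, "order_id")
joined_shipping_total = sum(r.get("shipping_cost", 0.0) for r in joined)

# Relationship-style: aggregate each table at its native grain.
native_shipping_total = sum(r["shipping_cost"] for r in shipping)

print(joined_shipping_total, native_shipping_total)
```

The joined total is inflated because the order-level value is repeated on every line item of the same order.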
Joins (Physical)
■ Types: Inner, Left, Right, Full Outer
■ Join on keys (e.g., Order ID, Customer ID)
■ Watch for row multiplication (duplicate keys) & nulls (unmatched rows)
■ Validate on the Data Source page: use View Data and compare row counts before and after the join
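The four join types and the row-count check can be illustrated with a minimal plain-Python sketch. The `join` helper and the `customers` / `orders` sample rows are hypothetical; the point is only to show how the result size changes with the join type.

```python
def join(left, right, key, how="inner"):
    """Join two lists of dicts on `key`; `how` is inner, left, right, or full."""
    left_keys = {row[key] for row in left}
    out = []
    for left_row in left:
        matches = [r for r in right if r[key] == left_row[key]]
        if matches:
            # One output row per match: this is where row multiplication comes from.
            out.extend({**r, **left_row} for r in matches)
        elif how in ("left", "full"):
            out.append(dict(left_row))  # unmatched left row; right columns are null
    if how in ("right", "full"):
        # Unmatched right rows; left columns are null.
        out.extend(dict(r) for r in right if r[key] not in left_keys)
    return out


customers = [
    {"customer_id": 1, "name": "Ann"},
    {"customer_id": 2, "name": "Bo"},
    {"customer_id": 3, "name": "Cy"},
]
orders = [
    {"customer_id": 1, "order_id": "A1"},
    {"customer_id": 1, "order_id": "A2"},  # second order for the same customer
    {"customer_id": 4, "order_id": "D9"},  # no matching customer
]

counts = {
    how: len(join(customers, orders, "customer_id", how))
    for how in ("inner", "left", "right", "full")
}
null_orders = sum(1 for r in join(customers, orders, "customer_id", "left")
                  if "order_id" not in r)

print(counts, null_orders)
```

Comparing these counts with the row count of the left table alone is the same check the slide recommends doing with View Data.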
Practice: Joining Tables (Superstore)
Goal: Join Orders with Returns and People.
1. Connect to Sample - Superstore.xlsx.
2. On the logical layer, double-click Orders to open its physical layer.
3. Add Returns → Left Join on Order ID (keeps all Orders).
4. Add People → Left Join on Region.
5. Build view: Rows: Sub-Category; Columns: SUM(Sales); Color: IF [Returned]='Yes' THEN 'Returned' ELSE 'Not Returned' END.
6. Validate row counts against Orders alone & check for unexpected duplicates.
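Steps 3, 5, and 6 above can be sketched in plain Python. The rows below are made-up stand-ins, not actual Superstore data, and the lookup by `returned_ids` is an assumption that mimics a left join against a de-duplicated Returns table.

```python
from collections import Counter, defaultdict

# Made-up stand-ins for the Orders and Returns sheets.
orders = [
    {"order_id": "CA-1", "sub_category": "Chairs", "sales": 200.0},
    {"order_id": "CA-2", "sub_category": "Chairs", "sales": 50.0},
    {"order_id": "CA-3", "sub_category": "Paper", "sales": 10.0},
]
returns = [
    {"order_id": "CA-1", "returned": "Yes"},
    {"order_id": "CA-1", "returned": "Yes"},  # duplicate key: would double CA-1 in a join
]

# Step 6: find keys that would multiply rows in the join.
duplicate_keys = [k for k, n in Counter(r["order_id"] for r in returns).items() if n > 1]

# De-duplicated lookup, equivalent to a left join on a cleaned Returns table.
returned_ids = {r["order_id"] for r in returns if r["returned"] == "Yes"}


def return_status(order):
    """Mirrors IF [Returned]='Yes' THEN 'Returned' ELSE 'Not Returned' END.

    Orders with no match (null Returned) fall into the ELSE branch.
    """
    return "Returned" if order["order_id"] in returned_ids else "Not Returned"


# Step 5: SUM(Sales) by Sub-Category, split by return status.
sales_by_group = defaultdict(float)
for order in orders:
    sales_by_group[(order["sub_category"], return_status(order))] += order["sales"]

total_after = sum(sales_by_group.values())
total_before = sum(o["sales"] for o in orders)

print(duplicate_keys, dict(sales_by_group), total_after == total_before)
```

If the totals before and after differ, the join has duplicated or dropped rows.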
Cross-Database Join
■ Join tables from different connectors in the same
data source (e.g., Excel + Text/CSV/DB)
■ Same cautions: row multiplication, matching keys,
data types
■ Prefer relationships if grains differ (e.g., Orders vs
Marketing by Sub-Category)
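The data-type caution is easy to reproduce outside Tableau. In this sketch the two row lists stand in for a database table and a CSV file (both hypothetical), and `matched` is an illustrative helper: the same key stored as an integer in one source and as text in the other matches nothing until it is normalized.

```python
# Hypothetical rows from two different connections.
db_rows = [
    {"product_id": 101, "sales": 500.0},
    {"product_id": 102, "sales": 300.0},
]
csv_rows = [
    {"product_id": "101", "category": "Chairs"},   # key stored as text
    {"product_id": "102 ", "category": "Tables"},  # text with trailing whitespace
]


def matched(left, right, key, normalize=lambda value: value):
    """Return the left rows whose (normalized) key also appears on the right."""
    right_keys = {normalize(r[key]) for r in right}
    return [row for row in left if normalize(row[key]) in right_keys]


# Integer 101 never equals the string "101", so nothing matches.
raw_matches = matched(db_rows, csv_rows, "product_id")

# Cast both sides to trimmed text before comparing.
clean_matches = matched(db_rows, csv_rows, "product_id",
                        normalize=lambda value: str(value).strip())

print(len(raw_matches), len(clean_matches))
```

In Tableau the equivalent fix is to change the field's data type or join on a calculation so both sides of the join clause agree.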
How Data Blending Works
■ Worksheet-level, post-aggregation; behaves like a left join from Primary to Secondary
■ Link on common fields (link icon)
■ Secondary source shows an orange check mark; its fields cannot be used in row-level calcs across sources (aggregate them first)
■ Good for: published data sources, different grains, or when a join is impractical
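"Post-aggregation" means each source is summarized on the linking field first and only the summaries are combined. A minimal plain-Python sketch, with hypothetical `orders` and `marketing` rows and illustrative `aggregate` / `blend` helpers:

```python
from collections import defaultdict

# Hypothetical primary (orders) and secondary (marketing) sources.
orders = [
    {"sub_category": "Chairs", "sales": 100.0},
    {"sub_category": "Chairs", "sales": 150.0},
    {"sub_category": "Tables", "sales": 300.0},
    {"sub_category": "Paper", "sales": 20.0},
]
marketing = [
    {"sub_category": "Chairs", "spend": 40.0},
    {"sub_category": "Chairs", "spend": 10.0},
    {"sub_category": "Tables", "spend": 60.0},
    {"sub_category": "Phones", "spend": 70.0},  # no match in the primary
]


def aggregate(rows, link, measure):
    """SUM(measure) per value of the linking field."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[link]] += row[measure]
    return dict(totals)


def blend(primary_rows, secondary_rows, link, primary_measure, secondary_measure):
    """Aggregate each source separately, then left-combine on the linking field."""
    primary = aggregate(primary_rows, link, primary_measure)
    secondary = aggregate(secondary_rows, link, secondary_measure)
    # Every primary key is kept; missing secondary values become None (null).
    return {key: (value, secondary.get(key)) for key, value in primary.items()}


blended = blend(orders, marketing, "sub_category", "sales", "spend")
print(blended)
```

Because the two Chairs spend rows are summed before being combined, the two Chairs order rows are not multiplied, which is the main difference from a row-level join.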
Using Data Blending (Steps)
■ Build view with Primary data source (blue check mark).
■ Add Secondary data source → define linking field(s)
(ensure same data type).
■ Place Secondary measures on view → Tableau blends
on linked fields.
■ Control grain via level of detail (Dimensions on view).
■ Validate by toggling links & checking mark counts.
Practice: Effect of Primary Selection (Blend)
1. Source A: Superstore Orders (Sales/Profit).
2. Source B: marketing_expenses.csv (Marketing Spend by Sub-Category).
3. Make Orders Primary; link on Sub-Category; plot Spend vs Sales (scatter).
4. Switch Primary: on a new sheet, drag a Marketing field in first so Marketing becomes Primary; rebuild the view.
5. Observe changes: which marks disappear (no matching keys in the new Primary), filter behaviors, totals.
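What to expect in step 5 can be previewed with a short plain-Python sketch. The two dictionaries are hypothetical, already-aggregated values per Sub-Category, and `blend` is an illustrative helper that keeps every key of whichever source is primary.

```python
# Hypothetical aggregated values per Sub-Category.
orders_sales = {"Chairs": 250.0, "Tables": 300.0, "Paper": 20.0}
marketing_spend = {"Chairs": 50.0, "Tables": 60.0, "Phones": 70.0}


def blend(primary, secondary):
    """Keep every primary key; secondary values are None where no link exists."""
    return {key: (value, secondary.get(key)) for key, value in primary.items()}


orders_primary = blend(orders_sales, marketing_spend)     # (sales, spend)
marketing_primary = blend(marketing_spend, orders_sales)  # (spend, sales)

# Keys that appear only under one choice of primary.
only_with_orders_primary = set(orders_primary) - set(marketing_primary)
only_with_marketing_primary = set(marketing_primary) - set(orders_primary)

# Total sales visible in each view: Paper's sales vanish when Marketing is primary.
sales_total_orders_primary = sum(sales for sales, _ in orders_primary.values())
sales_total_marketing_primary = sum(
    sales for _, sales in marketing_primary.values() if sales is not None
)

print(only_with_orders_primary, only_with_marketing_primary)
print(sales_total_orders_primary, sales_total_marketing_primary)
```

The same measure can therefore total differently depending on which source drives the view.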
Unions (Vertical)
■ Stack tables row-wise with same/similar schema
■ For files: Wildcard (Union) across folder; for DB:
union tables in same DB/schema
■ Adds Table Name field (origin tracking)
■ Use for: monthly files, yearly partitions, multi-sheet Excel
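A wildcard union can be approximated in plain Python as follows. The `files` dictionary stands in for a folder of CSV files (its names and rows are made up), and `union` is an illustrative helper that stacks the matching tables and records each row's origin in a "Table Name" column.

```python
from collections import defaultdict
from fnmatch import fnmatch

# Stand-in for a folder: file name -> rows. Names and values are made up.
files = {
    "Orders_2022.csv": [{"order_id": "A1", "sales": 100.0}],
    "Orders_2023.csv": [
        {"order_id": "B1", "sales": 150.0},
        {"order_id": "B2", "sales": 50.0},
    ],
    "Returns_2023.csv": [{"order_id": "B1", "returned": "Yes"}],  # excluded by the pattern
}


def union(tables, pattern):
    """Stack the rows of every table whose name matches the wildcard pattern."""
    rows = []
    for name in sorted(tables):
        if fnmatch(name, pattern):
            rows.extend({**row, "Table Name": name} for row in tables[name])
    return rows


unioned = union(files, "Orders_*.csv")

# Validation: the per-file totals should add up to the unioned total.
totals_by_file = defaultdict(float)
for row in unioned:
    totals_by_file[row["Table Name"]] += row["sales"]
unioned_total = sum(row["sales"] for row in unioned)

print(len(unioned), dict(totals_by_file), unioned_total)
```

The origin column is what makes the per-file check possible after the tables have been stacked.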
Practice: Unions
■ Connect to a folder with Orders_2022.csv,
Orders_2023.csv, Orders_2024.csv.
■ On the Data Source page, drag New Union to the canvas → Wildcard (automatic) → pattern Orders_*.csv.
■ Verify Table Name appears.
■ Build view: Rows: YEAR(Order Date); Columns:
Region; Text: SUM(Sales).
■ Compare the unioned totals against each file's own total.
Filter Across Multiple Data Sources
■ Option A (Related sources): Apply to Worksheets
→ All Using Related Data Sources
■ Option B (Universal): Parameter-based filter + calc
fields in each source
■ Consider Data Source Filters for governance;
Context Filters for performance
Practice: Filtering Across Multiple Data Sources
A. Related data sources
1. Ensure both sources have [Region] (same type/values).
2. Put Region on Filters → Apply to Worksheets → All Using Related
Data Sources.
3. Confirm both sheets (from different sources) respond.
B. Parameter method (universal)
4. Create a string parameter pRegion (allowable values: list of Regions).
5. In each data source, create a calc [Region] = [pRegion]; put it on Filters and keep True.
6. Show the pRegion control on the dashboard; test synchronized filtering.
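The parameter method in part B works because one shared value drives a separate boolean calc in each source. A minimal plain-Python sketch, with hypothetical `orders` and `targets` rows and illustrative helper names:

```python
# Hypothetical rows from two unrelated data sources that both carry Region.
orders = [
    {"Region": "East", "Sales": 100.0},
    {"Region": "West", "Sales": 70.0},
    {"Region": "East", "Sales": 30.0},
]
targets = [
    {"Region": "East", "Target": 120.0},
    {"Region": "South", "Target": 90.0},
]


def region_matches(row, selected_region):
    """Stand-in for the calc [Region] = [pRegion] created in each source."""
    return row["Region"] == selected_region


def apply_parameter(rows, selected_region):
    """Stand-in for putting the calc on Filters and keeping True."""
    return [row for row in rows if region_matches(row, selected_region)]


p_region = "East"  # the single parameter value shown on the dashboard

filtered_orders = apply_parameter(orders, p_region)
filtered_targets = apply_parameter(targets, p_region)

print(len(filtered_orders), len(filtered_targets))
```

No link between the sources is needed; the only requirement is that both use the same Region values as the parameter list.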