SAP Datasphere (SAP DSP)
DATA BUILDER
SAP DSP – ‘Data Builder’ Overview
• Acquiring & Preparing Data
Modelers can import data directly from various connections and sources, then use flows to replicate, extract, transform, and load it seamlessly.
• Modeling Data
Once data is in, modelers can enrich it by adding semantic information, refining entities, and building focused analytical models. These models are ready for use in SAP Analytics Cloud, Microsoft Excel, and other integrated tools and applications.
DATA BUILDER – Objects
1. Tables: Local Table, Remote Table
2. Views: Graphical View, SQL View
3. Semantic Usage
4. ER Models
5. Analytic Model
6. Flows: Data Flow, Replication Flow, Transformation Flow
7. Intelligent Lookups
8. Task Chain
9. Data Access Control
1. Tables: Local vs. Remote
Local Table: A table created and stored directly in your SAP Datasphere space. Use it to persist data (e.g., from CSV uploads or data flows) for modeling or analytics.
Remote Table: A table linked to an external source via a connection. It federates data by default (fetches live from the source) but can be replicated to SAP Datasphere for better performance. Perfect for integrating external systems without duplicating everything locally!
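To make the federation point concrete, here is a hedged sketch (the table name SALES_ORDERS_REMOTE is hypothetical): consumers query a remote table the same way regardless of its load mode; switching from federation to replication changes where the data is read from, not the query.

-- Hypothetical remote table. The query is identical whether the table
-- federates live from the source or has been replicated into
-- SAP Datasphere; replication only changes performance and freshness.
SELECT ORDER_ID, ORDER_DATE, NET_AMOUNT
FROM SALES_ORDERS_REMOTE
WHERE ORDER_DATE >= '2024-01-01';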
2. Views: Graphical vs. SQL
a) Graphical View: A no-code way to model data using a drag-and-drop interface. Supports filters, column rename/exclusion, calculated columns, aggregation, and joins.
• Filter: Restrict data rows based on conditions (e.g., sales > 1000).
• Rename/Exclude Columns: Tweak column names or drop unnecessary ones for clarity.
• Calculated Columns: Create new columns with formulas (e.g., profit = revenue - cost).
• Aggregation: Summarize data (e.g., SUM, AVG) for reporting.
• Join Suggestion: Auto-suggests joins between tables based on defined associations – saves time!
b) SQL View: Write SQL code for complex logic. Ideal for business requirements that need precise control, such as subqueries, CTEs, SQLScript logic, or advanced joins, beyond what graphical views offer.
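As a hedged illustration (SALES and PRODUCTS are hypothetical tables), an SQL view body is simply a SELECT statement, and a CTE like the one below is the kind of logic that goes beyond graphical modeling:

-- CTE computing per-product totals, then a join and a derived margin.
WITH product_totals AS (
    SELECT PRODUCT_ID,
           SUM(REVENUE)        AS TOTAL_REVENUE,
           SUM(REVENUE - COST) AS TOTAL_PROFIT    -- calculated column
    FROM SALES
    WHERE ORDER_DATE >= '2024-01-01'              -- filter
    GROUP BY PRODUCT_ID                           -- aggregation
)
SELECT p.PRODUCT_NAME,
       t.TOTAL_REVENUE,
       t.TOTAL_PROFIT / NULLIF(t.TOTAL_REVENUE, 0) AS PROFIT_MARGIN
FROM product_totals t
JOIN PRODUCTS p ON p.PRODUCT_ID = t.PRODUCT_ID;   -- join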
3. Semantic Usage: Giving Data Meaning
• Fact: Indicates that your entity contains numerical measures that can be analyzed.
• Dimension: Defines master data (e.g., product lists) with attributes for context.
• Text: Adds multilingual text attributes (e.g., descriptions) to enrich data.
• Hierarchy: Sets up parent-child relationships (e.g., region > city) for drill-downs (sketched below).
• Hierarchy with Directory: The entity contains one or more parent-child hierarchies and has an association to a directory dimension containing a list of the hierarchies.
• Relational Dataset: A neutral table/view with no analytical purpose; just raw data for further processing.
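For the Hierarchy usage, a parent-child dimension is typically just a table where each row points at its parent; a minimal sketch with hypothetical names:

-- REGION_DIM is hypothetical; PARENT_ID is NULL at the top of the tree.
CREATE TABLE REGION_DIM (
    REGION_ID   VARCHAR(10) PRIMARY KEY,
    PARENT_ID   VARCHAR(10),            -- parent-child link (region > city)
    REGION_NAME VARCHAR(60)
);
INSERT INTO REGION_DIM VALUES ('NA',  NULL, 'North America');
INSERT INTO REGION_DIM VALUES ('NYC', 'NA', 'New York City');

In the Data Builder the hierarchy itself is declared in the UI, roughly by marking which columns play the parent and child roles.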
4. ER Models: Mapping Relationships
• Create Table: Build a new table within the entity-relationship diagram.
• Add a Column: Expand a table with new fields for more data.
• Create View from Selection: Generate a view from selected tables/views in the model.
• Create Association: Link entities (e.g., orders to customers) for relational modeling.
• Add Related Entities: Bring in connected tables/views to complete the data picture.
Purpose: Visualize and manage data relationships intuitively!
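An association stores the join path without copying any data; at query time it resolves to a join, roughly like this hedged sketch (ORDERS and CUSTOMERS are hypothetical):

-- The ORDERS-to-CUSTOMERS association effectively enables this join.
SELECT o.ORDER_ID, o.NET_AMOUNT, c.CUSTOMER_NAME
FROM ORDERS o
LEFT JOIN CUSTOMERS c ON c.CUSTOMER_ID = o.CUSTOMER_ID;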
5. Analytic Model: Powering Analytics
a) Measures
• Calculated Measure: Define custom metrics (e.g., profit margin = profit/revenue).
• Restricted Measure: Limit a measure by conditions (e.g., sales for 2024 only).
• Count Distinct Measure: Count unique values (e.g., distinct customers).
• Currency Conversion Measure: Convert values across currencies dynamically.
• Non-Cumulative Measure: Handle non-additive metrics (e.g., stock levels).
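Measures are defined declaratively in the Analytic Model editor; as a rough relational equivalent (SALES is a hypothetical fact table), three of these measure types map to SQL like this:

SELECT
    SUM(REVENUE - COST)
        / NULLIF(SUM(REVENUE), 0)      AS PROFIT_MARGIN,     -- calculated measure
    SUM(CASE WHEN YEAR(ORDER_DATE) = 2024
             THEN REVENUE ELSE 0 END)  AS SALES_2024,        -- restricted measure
    COUNT(DISTINCT CUSTOMER_ID)        AS DISTINCT_CUSTOMERS -- count distinct
FROM SALES;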
b) Variables
• Standard Variable: User-input values for flexible queries.
• Restricted Measure Variable: Filters a restricted measure dynamically.
• Filter Variable: Applies runtime filters to data.
• Reference Date Variable: Sets a date context for time-based calculations.
Usage: Prepares data for SAP Analytics Cloud with multidimensional flexibility!
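Conceptually, a variable behaves like a bound parameter that the consuming client supplies at runtime; a hedged sketch with a hypothetical :REF_DATE parameter:

-- Reference date variable applied as a runtime filter.
SELECT PRODUCT_ID, SUM(REVENUE) AS REVENUE
FROM SALES
WHERE ORDER_DATE <= :REF_DATE
GROUP BY PRODUCT_ID;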
6. Flows: Moving and Transforming Data
1. Data Flow
✓ Purpose: General-purpose ETL (Extract, Transform, Load) pipeline for
data movement and transformation.
✓ Use Case: Combines aspects of both replication and transformation in a single pipeline.
✓ Key Features:
• Data can be extracted from multiple sources, transformed, and then loaded into
target tables.
• More flexible and customizable. Useful for building end-to-end data pipelines.
Data Flow: Operators available to create a data flow
• Source Table(s): Start with your input data.
• Projection: Select/filter columns.
• Join: Combine tables.
• Union: Stack datasets.
• Script (Python): Add custom logic with Python.
• Aggregation: Summarize data.
• Target Table: Write results to a table.
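A data flow is authored visually, not in SQL, but its operators have familiar relational equivalents. A hedged sketch (all table names hypothetical) of what a small union-join-aggregate flow computes:

INSERT INTO SALES_SUMMARY (REGION_NAME, TOTAL_REVENUE)   -- target table
SELECT r.REGION_NAME,                                    -- projection
       SUM(s.REVENUE) AS TOTAL_REVENUE                   -- aggregation
FROM (
    SELECT REGION_ID, REVENUE FROM SALES_2023
    UNION ALL                                            -- union
    SELECT REGION_ID, REVENUE FROM SALES_2024
) s
JOIN REGION_DIM r ON r.REGION_ID = s.REGION_ID           -- join
GROUP BY r.REGION_NAME;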
2. Replication Flow
✓ Purpose: Real-time or scheduled copying of data from a source system to SAP
Datasphere without significant transformation.
✓ Use Case: When you want to mirror or synchronize source tables as-is.
✓ Key Features:
• Supports real-time replication for supported sources (like SAP S/4HANA, SAP
BW).
• Minimal or no transformation.
• Ensures high performance and fast updates.
2. Replication Flow: Objects to define
• Source Connection/Container/Objects: Define the origin.
• Target Connection/Container/Objects: Set the destination.
• Load Type (Initial Only / Initial and Delta): Full load or incremental updates.
• Run/Schedule: Execute manually or on a schedule.
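As a conceptual, standard-SQL sketch only (hypothetical tables and keys): an Initial and Delta load behaves like a full copy followed by repeated upserts of changed rows.

-- Delta pass: apply changed source rows to the target (conceptual).
MERGE INTO TARGET_ORDERS t
USING SOURCE_ORDER_DELTAS s
   ON t.ORDER_ID = s.ORDER_ID
WHEN MATCHED     THEN UPDATE SET t.NET_AMOUNT = s.NET_AMOUNT
WHEN NOT MATCHED THEN INSERT (ORDER_ID, NET_AMOUNT)
                      VALUES (s.ORDER_ID, s.NET_AMOUNT);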
3. Transformation Flow
✓ Purpose: Perform complex transformations on data after importing from source
systems.
✓ Use Case: When raw data needs to be cleaned, enriched, joined, filtered, or
aggregated before consumption.
✓ Key Features:
• Drag-and-drop interface to define transformation logic.
• Use of joins, filters, calculations, and aggregations.
• Output is often used for analytical models.
3. Transformation Flow: Advanced transformations
• Graphical View Transform: Use graphical views as steps.
• SQL View Transform: Apply SQL logic.
• Target: Output to a table.
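A hedged sketch of an SQL View Transform (all names hypothetical): the view's SELECT cleans and enriches the rows, and the flow persists the result set into the target table.

SELECT o.ORDER_ID,
       UPPER(TRIM(c.CUSTOMER_NAME)) AS CUSTOMER_NAME,   -- cleansing
       o.REVENUE - o.COST           AS PROFIT           -- enrichment
FROM ORDERS o
JOIN CUSTOMERS c ON c.CUSTOMER_ID = o.CUSTOMER_ID       -- join
WHERE o.STATUS <> 'CANCELLED';                          -- filter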
7. Intelligent Lookups: Smart Data Matching
Purpose: To match and enrich data from two different datasets—even if they
don’t have perfectly matching key fields.
Use Case: When you have master data and transactional data that share similar
(but not identical) attributes, and you want to intelligently map and combine
them.
Key Features:
• Uses fuzzy matching to suggest the best matches between fields.
• Minimizes the need for exact joins or manual mapping.
• Great for data enrichment or combining messy, siloed datasets.
Building an Intelligent Lookup:
1. Input Table Node: Your starting dataset.
2. Lookup: Reference another table for enrichment.
3. Rule(s): Define match logic; outputs include Matched, Unmatched, Unprocessed, or Error records.
4. Output View: Resulting enriched view.
Purpose: Automates data lookups (e.g., adding customer names to orders).
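Match rules are configured in the lookup editor, not written by hand; conceptually, a fuzzy rule resembles SAP HANA's fuzzy search, sketched here for a single candidate value (table and threshold are hypothetical):

-- Score approximate matches for one candidate name.
SELECT CUSTOMER_ID, CUSTOMER_NAME, SCORE() AS MATCH_SCORE
FROM CUSTOMERS
WHERE CONTAINS(CUSTOMER_NAME, 'Acme Corp', FUZZY(0.8))
ORDER BY MATCH_SCORE DESC;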
8. Task Chain: Orchestrating Tasks
➢Purpose: To automate and orchestrate a sequence of data-related tasks or flows (like data load, transform, publish).
➢Use Case: When you want to schedule or automate a multi-step data pipeline involving Data Flows, Transformation Flows, or Replication Flows.
➢Key Features:
• Build workflows of tasks.
• Configure dependencies and execution order.
• Supports scheduling and monitoring.
• Useful for end-to-end automation.
Building a Task Chain:
1. Add Task: Include a flow or job.
2. Connect: Link tasks in sequence.
3. ALL/ANY Operator: Run parallel tasks with success conditions.
4. Add as Parallel Branch: Execute tasks concurrently.
5. Add Placeholder: Reserve spots for future tasks.
6. Run or Schedule: Trigger manually or automate.
7. Email Notifications: Alert users on completion.
9. Data Access Control: Securing Data
➢Purpose: To enforce row-level or attribute-level security within your datasets and models.
➢Use Case: When different users or roles should see only specific parts of a dataset (e.g., region-based access for sales managers).
➢Key Features:
• Define DAC entities (e.g., region, country).
• Assign users or roles to specific DAC values.
• Integrated with SAP Analytics Cloud and other front-end tools for secure data consumption.
• Ensures data privacy and governance compliance.
Defining a Data Access Control:
• Structure: Defines the DAC framework on a view/model.
• Permissions Entity: Links to a table with user/role permissions.
• Criteria: Sets row-level filters (e.g., “Region = ‘NA’” for specific users).
Purpose: Ensures users only see authorized data; crucial for compliance!
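Conceptually, a single-values DAC filters every query against the permissions entity for the current consumer; a hedged sketch (tables and columns hypothetical):

-- Rows survive only if the user is authorized for their region.
SELECT s.*
FROM SALES s
WHERE s.REGION IN (
    SELECT p.REGION
    FROM DAC_PERMISSIONS p
    WHERE p.USER_NAME = SESSION_CONTEXT('APPLICATIONUSER')
);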
Thank you!
kr_pavankumar