03 Master and Reference Data Management
Contents
1
Key Points and Master
Data Management (MDM)
Reference and Master Data
Management
Data Management 2%
Big Data 2%
Data Architecture 6%
Document & Content Management 6%
Data Ethics 2%
Data Governance 11%
Data Integration & Interoperability 6%
Master & Reference Data Management 10%
Data Modelling & Design 11%
Data Quality 11%
Data Security 6%
Data Storage & Operations 6%
Data Warehousing & Business Intelligence 10%
Metadata Management 11%
Reference and Master Data Management
Definition: Managing shared data to meet organizational goals, reduce risks associated with data redundancy, ensure higher quality, and reduce the costs of data integration.
2. Provide authoritative source of reconciled and quality-assessed master and reference data.
3. Lower cost and complexity through use of standards, common data models, and integration patterns.
Business Drivers
• Shared reference and master data belongs to the organization, not to a particular
application or department
• Reference and Master Data Management is an ongoing Data Quality improvement
program; its goals cannot be achieved by one project alone
• Business data stewards are the authorities accountable for controlling reference data
values. Business data stewards work with data professionals to improve the quality of
reference and master data
• Golden data values represent the organization’s best efforts at determining the most accurate,
current, and relevant data values for contextual use. New data may prove earlier assumptions
to be false. Therefore, apply matching rules with caution, and ensure that any changes that
are made are reversible
• Replicate master data values only from the database of record
• Request, communicate, and, in some cases, approve changes to reference data values
before implementation
WHAT IS EVENT/TRANSACTION DATA?
EVENT DATA EXAMPLE:
“Bob bought a Twix bar from Morrison's on Monday, Jan. 3rd at 4 p.m. and paid using cash.”
WHO        WHAT      WHERE             WHEN                             HOW   QUANTITY  AMOUNT
Bob Smith  Twix bar  Morrison's, Bath  16:00 Monday, January 3rd, 2011  Cash  1         £0.60
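The event row above can be read as a transaction that points at master data (WHO, WHAT, WHERE) and reference data (HOW), with only the event-specific facts held on the event itself. A minimal sketch of that reading, assuming hypothetical keys such as `CUST-1`, `PROD-1` and `STORE-1` and a hypothetical `SaleEvent` type, none of which appear in the slides:

```python
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal

# Hypothetical master data, keyed by illustrative identifiers (not from the slides).
CUSTOMERS = {"CUST-1": "Bob Smith"}
PRODUCTS = {"PROD-1": "Twix bar"}
STORES = {"STORE-1": "Morrison's, Bath"}

# Hypothetical reference data: a small, fixed code list.
PAYMENT_METHODS = {"CASH": "Cash", "CARD": "Card"}


@dataclass(frozen=True)
class SaleEvent:
    """One transaction: keys into master/reference data plus event-specific facts."""
    customer_id: str       # WHO   -> master data
    product_id: str        # WHAT  -> master data
    store_id: str          # WHERE -> master data
    occurred_at: datetime  # WHEN
    payment_code: str      # HOW   -> reference data
    quantity: int
    amount: Decimal


def describe(event: SaleEvent) -> str:
    """Resolve the keys to produce a readable sentence about the event."""
    return (
        f"{CUSTOMERS[event.customer_id]} bought {event.quantity} x "
        f"{PRODUCTS[event.product_id]} from {STORES[event.store_id]} at "
        f"{event.occurred_at:%H:%M on %A, %d %B %Y} and paid £{event.amount} "
        f"using {PAYMENT_METHODS[event.payment_code].lower()}."
    )


sale = SaleEvent("CUST-1", "PROD-1", "STORE-1",
                 datetime(2011, 1, 3, 16, 0), "CASH", 1, Decimal("0.60"))
print(describe(sale))
```

The event stores keys rather than names, so a correction to the customer or product master record is picked up by every event that references it.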
About EVENT DATA
WHAT IS …
MDM BUSINESS DRIVERS
• Meeting Organizational Data Requirements
• Managing Data Quality
• Managing the Costs of Data Integration
• Reducing Risk
Standard “Hub” architectures:
1. REPOSITORY
2. REGISTRY
3. HYBRID
4. VIRTUALIZED
A key difference is the number of fields that are stored centrally.
EXAMPLE: CUSTOMER
Customer code: JBS005
First name: Bob
Last name: Smith
Date of birth: 1985-12-25
Preferred delivery address line 1: Royal Crescent
Preferred delivery address post code: BA1 7LA
Credit rating: A
Occupation: Information Architect
Car: Audi R8
Field groupings: IDENTIFIERS, CORE FIELDS, ALL FIELDS
EXAMPLE: CUSTOMER
Customer code: BS005
First name: Bob
Last name: Smith
Date of birth: 1985-12-25
Preferred delivery address line 1: Royal Crescent
Preferred delivery address post code: BA1 7LA
Credit rating: A
Occupation: Information Architect
Car: Audi R8
…
Fields stored centrally, by architecture:
Repository: ALL FIELDS
Hybrid: CORE FIELDS
Registry: IDENTIFIERS
Virtualised: NONE
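The difference between the four hub styles can be expressed as the subset of fields each one holds centrally. A minimal sketch, assuming that a Repository holds all fields, a Hybrid holds the core fields, a Registry holds identifiers only and a Virtualised hub holds none. The split into identifier and core fields below is an illustrative assumption; the slides do not say which fields belong to which group.

```python
# Illustrative customer record, using the values shown on the slide.
customer = {
    "customer_code": "JBS005",
    "first_name": "Bob",
    "last_name": "Smith",
    "date_of_birth": "1985-12-25",
    "delivery_address_line_1": "Royal Crescent",
    "delivery_post_code": "BA1 7LA",
    "credit_rating": "A",
    "occupation": "Information Architect",
    "car": "Audi R8",
}

# Assumption: which fields count as identifiers or core fields is a design
# decision; this split is illustrative only.
IDENTIFIERS = ["customer_code"]
CORE_FIELDS = IDENTIFIERS + ["first_name", "last_name", "date_of_birth"]
ALL_FIELDS = list(customer)

FIELDS_STORED_CENTRALLY = {
    "repository": ALL_FIELDS,
    "hybrid": CORE_FIELDS,
    "registry": IDENTIFIERS,
    "virtualised": [],
}


def central_copy(record, style):
    """Return the part of the record that a hub of the given style would hold."""
    return {name: record[name] for name in FIELDS_STORED_CENTRALLY[style]}


for style in FIELDS_STORED_CENTRALLY:
    print(style, sorted(central_copy(customer, style)))
```

Anything not held centrally has to be fetched from the systems of origin when a full record is needed.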
System(s) of Origin
DATA OWNERS
DATA STEWARDS
Golden Record
DATA OWNERS
SYSTEM SYSTEM
DATA QUALITY USER INTERFACE
DATA STEWARDS
TYPICAL MDM COMPONENTS
[Figure: typical MDM components around MASTER DATA (e.g. Customer): SUPERSESSION CHAINING, DATA MODELING, DATA QUALITY, DISTRIBUTION, ACCESS, SOURCING; connected systems include System 2 and System 3]
MDM Capabilities
MDM ARCHITECTURE
[Figure: MDM architecture. Recoverable labels: DATA INTEGRATION & ACQUISITION; MASTER DATA “HUB”; DATA DELIVERY; DATA SYNCHRONIZATION; INTEGRATION; APPLICATION; MASTER DATA SERVICES; DATA DISCOVERY & INTAKE; MASTER DATA MODELS; REFERENCE DATA; DATA QUALITY; METADATA MANAGEMENT; BUSINESS RULES; HIERARCHY MANAGEMENT; ACCESS CONTROL; CONSOLIDATION; SERVICES; BUSINESS USER INTERFACE; MASTER DATA “REPOSITORY”]
1. MASTER DATA “HUB”
• Data transformation
• Data discovery to assess quality and metadata discovery (some MDM tool
vendors either bundle it or offer it as a separate product)
• Some vendors bundle their MDM platforms with data integration and
Data Quality/cleansing capabilities, while other vendors have
partnerships to provide these functions.
3. MASTER DATA SERVICES
• For each modeled master object, basic master data services should be
configured to:
• create
• read
• MDM products may provide either a service library or the means for
creating master data services as master object models are integrated
or enhanced
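The basic services listed above can be pictured as a thin layer in front of the stored master records. A minimal in-memory sketch covering only the two operations named on the slide (create and read); the class name `MasterDataService` and its behaviour on duplicate keys are assumptions for illustration, not any MDM product's API:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class MasterDataService:
    """In-memory create/read service for one modeled master object (e.g. Customer)."""
    object_name: str
    _records: Dict[str, dict] = field(default_factory=dict)

    def create(self, key: str, attributes: dict) -> dict:
        """Add a new master record; refuse to overwrite an existing key."""
        if key in self._records:
            raise KeyError(f"{self.object_name} {key!r} already exists")
        self._records[key] = dict(attributes)  # store a copy, not the caller's dict
        return dict(self._records[key])

    def read(self, key: str) -> Optional[dict]:
        """Return a copy of the master record, or None if the key is unknown."""
        record = self._records.get(key)
        return dict(record) if record is not None else None


customers = MasterDataService("Customer")
customers.create("JBS005", {"first_name": "Bob", "last_name": "Smith"})
print(customers.read("JBS005"))
```

Returning copies keeps consuming applications from changing master data except through the service.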
4. DATA DELIVERY
• “look up product”
5. ACCESS CONTROL
6. SYNCHRONIZATION
7. DATA QUALITY, GOVERNANCE, AND OPERATIONS
• Inspection
• Data Correction
MASTER DATA MATCH RULES
Rules around the matching, merging and linking of data from multiple systems about the same person, group, place or thing.
2. Match-merge rules
Match records and merge the data from these records into a single, unified,
reconciled, and comprehensive record. If the rules apply across data sources, create
a single unique and comprehensive record in each database.
3. Match-link rules
Identify and cross-reference records that appear to relate to a master record
without updating the content of the cross-referenced record. Match-link rules are
easier to implement and much easier to reverse.
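The contrast between match-merge and match-link can be sketched as follows. This is an illustration under stated assumptions: the two source records, their keys (`C-17`, `E-42`), the master id `M-1` and the "first non-empty value wins" survivorship rule are all hypothetical and not taken from the slides.

```python
# Two source records that a (hypothetical) match rule has judged to be the same person.
crm = {"source": "CRM", "id": "C-17", "name": "Bob Smith",
       "post_code": "BA1 7LA", "email": None}
erp = {"source": "ERP", "id": "E-42", "name": "Robert Smith",
       "post_code": "BA1 7LA", "email": "bob@example.com"}


def match_merge(records):
    """Merge into one record, taking the first non-empty value per field.

    Assumption: "first non-empty value wins" stands in for real survivorship rules.
    """
    merged = {}
    for record in records:
        for key, value in record.items():
            if key in ("source", "id"):
                continue  # source keys are not part of the merged content
            if merged.get(key) is None:
                merged[key] = value
    return merged


def match_link(master_id, records, xref):
    """Cross-reference the records to a master id without touching their content."""
    for record in records:
        xref[(record["source"], record["id"])] = master_id
    return xref


def unlink(record, xref):
    """Reverse a link by deleting its cross-reference entry."""
    xref.pop((record["source"], record["id"]), None)


golden = match_merge([crm, erp])
xref = match_link("M-1", [crm, erp], {})
print(golden)
print(xref)
```

Undoing a link only deletes a cross-reference entry, whereas undoing a merge requires the original source values to have been kept somewhere, which is why links are the easier of the two to reverse.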
MDM … A HUB IS NOT THE ONLY WAY
[Figure: two panels, MASTER DATA CONTRIBUTORS and SINGLE DATA SOURCE, each showing ERP, CRM and LEGACY systems joined by a SYNCHRONISATION LAYER; the label MIDDLEWARE also appears]
MDM … A HUB IS NOT THE ONLY WAY
[Figure: two panels, Master Overlay and Hub-Based Master. Labels: INDUSTRY SPECIFIC MASTER; OPERATIONAL DATA-STORE; E.G., CUSTOMER MASTER; SYNCHRONIZATION LAYER; EXTRACT-TRANSFORM-LOAD]
MDM … A HUB IS NOT THE ONLY WAY
[Figure: MASTER DATA shared across many systems; labels include SAP 1, SAP 3, SAP, ORACLE, SIEBEL, CRM, DW, DSL and LEGACY]
MASTER DATA SHARING ARCHITECTURE (EXAMPLE)
Each master data subject area typically has its own system of record.
Example shown: hub-and-spoke architecture for master data.
The master data hub handles interactions with spoke items such as source systems, business applications, and data stores while minimizing the number of integration points.
A local data hub can extend and scale the master data hub.
Registry is an index that points to master data in the various systems of record.
+ Systems of record manage master data local to their applications.
+ Access to master data comes from the master index.
+ A registry is relatively easy to implement because it requires few changes in the systems of record.
- Often, complex queries are required to assemble master data from multiple systems.
- Moreover, multiple business rules need to be implemented to address semantic differences across systems in multiple places.
Transaction Hub: applications interface with the hub to access and update master data.
+ Master data exists within the Transaction Hub and not within any other applications.
+ Transaction Hub is the system of record for master data.
+ Transaction Hubs enable better governance and provide a consistent source of master data.
- However, it is costly to remove the functionality to update master data from existing systems of record.
- Business rules are implemented in a single system … the Hub.
Consolidated approach is a hybrid of Registry and Transaction Hub.
+ Systems of record manage master data local to their applications.
+ Master data is consolidated within a common repository and made available from a data-sharing hub, the system of reference for master data.
+ Eliminates the need to access directly from the systems of record.
+ Consolidated approach provides an enterprise view with limited impact on systems of record.
- However, it entails replication of data.
- Latency between the hub and the systems of record.
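The registry trade-off described above (few changes to the systems of record, but assembly work and semantic mapping at query time) can be sketched as follows. The system names, local keys and field mappings are hypothetical and chosen only to illustrate the idea.

```python
# Hypothetical systems of record, each with its own local key and field names.
CRM = {"C-17": {"full_name": "Bob Smith", "postcode": "BA1 7LA"}}
BILLING = {"B-9": {"name": "Bob Smith", "credit_rating": "A"}}
SYSTEMS = {"CRM": CRM, "BILLING": BILLING}

# The registry holds only pointers: master id -> (system, local key).
REGISTRY = {"M-1": [("CRM", "C-17"), ("BILLING", "B-9")]}

# Semantic differences between systems have to be reconciled when assembling a view.
FIELD_MAP = {
    "CRM": {"full_name": "name", "postcode": "post_code"},
    "BILLING": {"name": "name", "credit_rating": "credit_rating"},
}


def assemble(master_id):
    """Build a master view by following registry pointers into each system."""
    view = {}
    for system, local_key in REGISTRY[master_id]:
        record = SYSTEMS[system][local_key]
        for local_field, value in record.items():
            # Assumption: the first system listed in the registry wins on conflicts.
            view.setdefault(FIELD_MAP[system][local_field], value)
    return view


print(assemble("M-1"))
```

In a Transaction Hub or Consolidated approach the assembled view would instead be stored in the hub, trading this query-time work for replication and latency.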
KEY PROCESSING STEPS FOR MDM
SINGLE-DOMAIN AND MULTI-DOMAIN MDM
SINGLE-DOMAIN
• E.g., Product (PIM), Customer (UCM), Vendor, Laboratory (LIM)
• Incorporate specific “domain” features …
• Householding
• Address chaining
• Company hierarchies
• Tailored data matching (deterministic or probabilistic)
• Interfaces to mainstream apps
MULTI-DOMAIN
• Highly configurable MDM solutions
• Informed by data model
• Fewer specific data domain features
• Results in fewer MDM solutions throughout the enterprise
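The two matching styles mentioned above can be contrasted in a few lines. This is a simplified sketch: the deterministic rule requires exact equality on chosen key fields, while a weighted string-similarity score stands in for a true probabilistic model. The field names, weights and threshold are hypothetical and would normally be tuned on sample data.

```python
from difflib import SequenceMatcher


def deterministic_match(a, b, keys=("date_of_birth", "post_code")):
    """Match only when every key field is exactly equal."""
    return all(a[k] == b[k] for k in keys)


# Hypothetical weights and threshold, for illustration only.
WEIGHTS = {"name": 0.5, "date_of_birth": 0.3, "post_code": 0.2}
THRESHOLD = 0.85


def similarity(x, y):
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, x.lower(), y.lower()).ratio()


def probabilistic_score(a, b):
    """Weighted sum of per-field similarities."""
    return sum(weight * similarity(a[f], b[f]) for f, weight in WEIGHTS.items())


def probabilistic_match(a, b):
    """Declare a match when the weighted score reaches the threshold."""
    return probabilistic_score(a, b) >= THRESHOLD


bob_crm = {"name": "Bob Smith", "date_of_birth": "1985-12-25", "post_code": "BA1 7LA"}
bob_erp = {"name": "Bob Smyth", "date_of_birth": "1985-12-25", "post_code": "BA17LA"}

print(deterministic_match(bob_crm, bob_erp))
print(probabilistic_match(bob_crm, bob_erp))
```

The deterministic rule rejects the pair because the post codes are formatted differently, while the score-based rule tolerates small variations; that tolerance is also why such matches should be applied with caution and kept reversible.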
IMPLEMENTATION: OPERATIONAL VS. ANALYTICAL MDM
OPERATIONAL
• MDM (whatever is most appropriate architecture) created for use in LIVE operational Business solutions
• Data integration (e.g., ETL) from source systems essential
• Data cleanse, consolidate, merge, de-duplicate, enrich, etc. essential
• Master data is used by the live operational line-of-business solutions
• Essential to address business rules (such as survivorship, channels, attribute federation, Data Governance) for operational MDM
• More difficult (technically and organizationally) to implement
• Impact of failure severe
• Benefits extensive across the organization
ANALYTICAL
• MDM (usually a hub) created for use in data warehouse/Business Intelligence solutions
• Data integration (e.g., ETL) from source systems still required
• Data cleanse, consolidate, merge, de-duplicate, enrich, etc. still required
• Master data used for analytical purposes to ensure consistency of analyses, findings, etc.
• Master data NOT used by live operational line-of-business solutions, therefore operational systems do not see any benefits of MDM approach
• NOT essential to address business rules (such as survivorship, channels, governance) for analytical MDM
• Easier (organizationally) to implement
• Impact of failure only on BI activities
Characteristics
MASTER VS. REFERENCE DATA FOCUS
REFERENCE VS MASTER DATA
Characteristic    Reference Data    Master Data
Number of Values  Low, fixed/known  Medium-high, variable/unknown
IT’S NOT “THE FIELD OF DREAMS”
• Subsets of the master data must be delivered “just in time” as the business projects need to
use them
• This means aligning the MDM initiatives with the business projects
ALIGNING MDM WITH BUSINESS INITIATIVES
REINVENTING THE WHEEL?
CONCLUSIONS
Reference and Master Data Management in NDMO
NDMO Guiding Principles
Discussion
3 To 5 Minutes
Questions
Ref: MDM 1
Question: What is a common motivation for Reference & Master Data Management?
Options:
• The need to improve Data Quality and data integrity across multiple data sources
• Regulatory acts such as BCBS239, GDPR, and SOX
• The need to consolidate all data into one physical database
• The need to build a data dictionary of all core data entities and attributes
• Business Intelligence and data warehousing
Ref: MDM 2
Question: Which of these is a valid definition of master data?
Options:
• Data that if missing or incorrect will cause transactions and processes to fail
• Data that is only held in one data source
• Data that rarely, if ever, changes
• Data that other data sits hierarchically beneath
• Data about the business entities that provide context for business transactions
Ref: MDM 3
Question: Which of these is a valid definition of reference data?
Options:
• Data that is widely accessed and referenced across an organization
• Data that is fixed and never changes
• Data used to classify or categorize other data entities
• Data that provides metadata about other data
• Data that has a common and widely understood data definition
Ref: MDM 4
Question: Which of the following is NOT a primary Master Data Management area of focus?
Options:
• Generating a golden record/best version of the truth
• Providing access to golden data records
• Identifying duplicate records
• Producing read-only versions of key data items
• Producing clear data definitions for master data
Ref: MDM 6
Question: A common driver for initiating a Reference Data Management program is:
Options:
• It will improve Data Quality and facilitate analysis across the organization
• It will consolidate the process of securing third party code sets
• It can be a one-time-only project
• Managing codes and descriptions requires little effort and low cost