CDMP Study Group
SESSION 2 - Chapter 1: Data Management
FEBRUARY 19, 2020
Tony Mazzarella, President
Email:
[email protected]
AGENDA
• Facilitator
• Introductory Note
• Review Homework
• Chapter 1: Data Management
• What is Data?
• Why we need data management
• Data and Information
• Data as an Asset
• Data Management Principles
• Data Management Challenges
• Data Management Strategy
• Data Management Frameworks
• DAMA and the DMBOK
• Q&A
• Next Session
New England Data Management Community
Facilitator
Tony Mazzarella, CDMP
Data Governance Director
President, DAMA New England
Chairman, DAMA I President’s Council
CONTACT INFO:
EMAIL:
[email protected]
PHONE: 860-879-5325
: /IN/TONYMAZZ
INTRODUCTORY NOTE
This study group is offered as a service of DAMA New England for DAMA New England
members. It is not an official, DAMA International-authorized training course because DAMA-I has
not yet created an authorized trainer program.
The purpose of this group is to help prepare members to take the CDMP. We will do so by
reviewing the content of chapters of the DMBOK2.
The chapter makes no claims for the effectiveness of the sessions or the ability of participants to
pass the CDMP exam after having attended. In fact, you should plan on doing a lot of individual
study to pass the exam.
HOMEWORK REVIEW
What is the one thing you have learned by reading chapter 1 on “Data
Management” that you did not know before?
Chapter 1: Data Management
Data Management is the development, execution, and supervision of plans, policies,
programs, and practices that deliver, control, protect, and enhance the value of data
and information assets throughout their lifecycle.
Business Drivers:
Competitive advantage – better data = better decisions
Failure to manage data results in waste and lost opportunity
Data as an asset - Primary driver of data management is to enable organizations to get
value from their data assets – similar to managing financial and physical assets.
Goals:
• Understanding and supporting the information needs of the enterprise and its
stakeholders, including customers, employees, and business partners
• Capturing, storing, protecting, and ensuring the integrity of data assets
• Ensuring the quality of data and information
• Ensuring the privacy and confidentiality of stakeholder data
• Preventing unauthorized or inappropriate access, manipulation, or use of data and
information
• Ensuring data can be used effectively and to add value to the enterprise
WHAT IS DATA?
Definitions:
• Most definitions emphasize the role of data in representing facts
• In IT, data also refers to information stored in digital format
Understanding Data:
• Understanding context requires a representational system, which includes a common vocabulary and relationships
between components
• If you understand the conventions of this system, you can interpret the data within it
• “These conventions are often documented in a specific kind of data referred to as Metadata” (DMBOK)
Data Requires Context to be meaningful:
• “Data is a means of representation. It stands for things other than itself” (Chisholm, 2010).
• “Data is both an interpretation of the objects it represents and an object that must be interpreted”
(Sebastian-Coleman, 2013).
Data is different than other assets:
• Not tangible
• It is durable and does not wear out; its value does change
• Easy to transport; difficult to replace if lost or destroyed
• Not consumed when used; can be stolen without being gone
• Same data can be used by multiple people at the same time
• Many uses of data beget more data
WHY WE NEED DATA MANAGEMENT
The correct interpretation of data requires understanding of the representation system
People and organizations may make different choices on how to represent concepts; these
choices will impact how data is interpreted.
There are multiple ways to represent the SAME IDEA
This challenge is why we need Data Architecture, modeling, governance, stewardship, metadata, and
data quality management
Changes in technology have expanded the scope of this need as they have changed people’s
understanding of what data is.
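The point about multiple representations of the same idea can be sketched in a few lines of Python. This is only an illustration: the three date conventions and the sample records are hypothetical, standing in for whatever conventions different source systems actually use.

```python
from datetime import date, datetime

# Hypothetical conventions used by three different source systems for the same date.
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")


def parse_date(raw: str) -> date:
    """Try each documented convention in turn; fail loudly if none applies."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"no known convention matches {raw!r}")


# Three representations of the same idea (the date of this session).
records = ["2020-02-19", "02/19/2020", "19.02.2020"]
parsed = {parse_date(r) for r in records}
```

All three strings resolve to one date only because the conventions were written down in KNOWN_FORMATS. A value such as 03/04/2020 could mean March 4 or April 3 depending on the system it came from, which is why documented conventions (metadata) and governance are needed.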
DATA AND (versus) INFORMATION
The idea that data is raw material and information is data in context
is problematic for data management:
• Data does not simply exist, it needs to be created
• It takes knowledge to create data in the first place
• The idea implies that data and information are separate things
• In practice, data is a form of information and information is a form of data
Key takeaway: Data and Information are used interchangeably in the
context of Data Management
DATA AS AN ASSET
An asset is an economic resource that can be owned or controlled,
and that holds or produces value. Assets can be converted to money.
Data is widely recognized as an enterprise asset, but the
understanding of what that means is still evolving.
Organizations rely on their data assets to make effective decisions
and operate efficiently
Many companies identify as “data-driven” - this must include the
recognition that data must be managed efficiently with professional
discipline through a partnership between business leadership and IT.
DATA MANAGEMENT PRINCIPLES 1/2
DMBOK Figure 1 Data Management Principles
DATA MANAGEMENT PRINCIPLES 2/2
DMBOK Figure 1 Data Management Principles
DM CHALLENGES – DATA ASSET / VALUATION & RISK
Because data management has distinct characteristics derived from the properties of data itself, it also presents
challenges in following these principles.
The ways data differs from other assets (refer to slide 7) make it difficult to determine its value or monetize it
Since data is unique to each organization, valuation techniques will differ – value of data is contextual
Establishing ways to associate financial value with data is critical for data management
Data not only represents value, it also represents risk. Low quality data is risky; so is data’s potential to be
misunderstood and/or misused.
• Data Quality
• Information Gaps (what we know/need to know)
• Privacy / Security
Each organization must articulate general cost or benefit categories for the valuation of data assets. Examples:
• Cost of obtaining/storing data
• Cost of replicating data if lost
• Impact to organization if data missing/incorrect
• Cost of risk event/ risk mitigation
• Benefits of high quality data
• Cost of data if sold
• Revenue from innovative uses of data
DM CHALLENGES – DATA QUALITY (DQ)
Ensuring data is of high quality is central to data management
DQ historically has been an afterthought because data has been associated with IT
Poor DQ is costly – IBM estimated cost of poor DQ in US was $3.1 Trillion
Costs of Poor Data Quality:
• Scrap and rework
• Workarounds
• Low productivity
• Conflict
• Employee and customer dissatisfaction
• Opportunity Costs
• Compliance costs, fines and reputational harm
Benefits of High Data Quality:
• Improved CX
• High productivity
• Reduced risk
• Act on opportunities
• Increased Revenue
• Competitive advantage
DM CHALLENGES – PLANNING FOR BETTER DATA
Deriving value from data requires planning! Challenges in planning are organizational pressures as well as resources
(time/money).
Orgs must recognize they can control how they obtain and create data. If they view data as a product, they will make
better decisions about it throughout its lifecycle
Planning requires collaboration and a strategic approach to architecture, modeling, and other design functions
Orgs need to understand the impact of technology on data to prevent technological temptation from driving their
decisions about data
Orgs must balance long and short term goals. Tradeoffs must be considered
Decisions throughout the data lifecycle require systems thinking, because they involve:
• How data connects to business processes
• Relationships between business processes and technology
• Design and architecture of systems and the data produced/stored
• How data might be used to advance organizational strategy
DM CHALLENGES – METADATA AND DATA MANAGEMENT
Reliable metadata is required to manage data as an asset. Metadata is a form of data which must be managed.
Orgs that do not manage data well typically do not manage their Metadata at all
Metadata describes what data an org has, what it represents, how it is classified, where it came from, how it moves,
how it evolves, who can use it and its quality
Metadata makes data, the data lifecycle, and the complex systems that contain data comprehensible
Metadata management often provides a starting point for improvements in data management overall
Metadata includes:
• Business Metadata
• Technical Metadata
• Operational Metadata
• Data Architecture Metadata
• Data Models
• Data Security Requirements
• Data Integration Standards
• Data Operational Processes
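As a rough illustration of how business, technical, and operational metadata can sit side by side for one dataset, here is a minimal Python sketch. The class name, every field name, and the sample values are hypothetical and are not taken from the DMBOK; real metadata repositories hold far more than this.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class DatasetMetadata:
    # Business metadata: what the data means and who is accountable for it.
    business_name: str
    definition: str
    steward: str
    # Technical metadata: how and where the data is physically stored.
    table_name: str
    column_types: Dict[str, str]
    # Operational metadata: where the data came from and when it was last loaded.
    source_system: str
    last_loaded: str  # ISO 8601 date of the most recent load

    def summary(self) -> str:
        """One-line description that makes the dataset comprehensible to a reader."""
        columns = ", ".join(f"{name} ({typ})" for name, typ in self.column_types.items())
        return (
            f"{self.business_name} [{self.table_name}] from {self.source_system}; "
            f"steward: {self.steward}; columns: {columns}"
        )


# Hypothetical sample record, for illustration only.
example = DatasetMetadata(
    business_name="Customer",
    definition="A person or organization that has purchased at least one product",
    steward="Sales Operations",
    table_name="crm.customer",
    column_types={"customer_id": "INTEGER", "signup_date": "DATE"},
    source_system="CRM",
    last_loaded="2020-02-19",
)
```

Calling example.summary() returns a single line that answers the questions from the slide: what the data is, where it came from, and who is responsible for it.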
DM CHALLENGES – CROSS FUNCTIONAL
Data Management is a complex process.
Data is managed in different places across an org by teams that have responsibility for different phases of the data
lifecycle
Data Management requires collaboration and coordination of people with a range of skills and perspectives to recognize
how the pieces fit together to work towards common goals.
Data Management Skills:
• Design Skills
• Technical Skills
• Data Analysis Skills
• Analytic Skills (Interpret data)
• Language Skills
• Strategic Thinking
DM CHALLENGES – EST. AN ENTERPRISE PERSPECTIVE
Managing data requires understanding the scope and range of data within and across an organization and making it fit
together in common sense ways.
Data is not only unique to an organization, it can be unique to a department or business unit
Data is often not planned for beyond the immediate need
Different departments may have different ways of representing the same concept – subtle or blatant differences can
create challenges in managing data
Stakeholders assume that an organization’s data should be coherent
Data Governance is vital to helping organizations make decisions about data across verticals
Other Perspectives:
• Internal / External data sources
• Legal and compliance reqs
• Knowledge of potential uses; who else will use it eventually
• Account for the fact data can be misused
DM CHALLENGES – THE DATA LIFECYCLE (DLC)
Like other assets, data has a lifecycle. To manage data you must understand and plan for the lifecycle. A strategic
organization will not only define data content requirements, but also data management requirements.
Data Lifecycle is based on the Product Lifecycle
Managing data involves a set of interconnected processes aligned
with the data lifecycle
Specifics of a lifecycle can be complex; data not only has a lifecycle
it has lineage
Implications of Data Management on the Data Lifecycle:
• Creation and usage are most critical points in DLC
• DQ must be managed throughout the DLC
• Metadata Quality must be managed throughout the DLC
• Data Security must be managed throughout the DLC
• Data Management Efforts should focus on the most critical data
Figure: Data Lifecycle Activities
DM CHALLENGES – DIFFERENT TYPES OF DATA
Managing data is made more complicated by the fact that there are different types of data that have different
lifecycle management requirements.
Any management system needs to classify the objects that are managed
Different types of data have different requirements, risks and roles within an organization
DM tools are focused on aspects of classification and control
Types of data:
• Transactional data
• Reference Data
• Master Data
• Metadata
• Category Data
• Resource Data
• Event Data
• Detailed Transaction Data
Other categories:
• Data Domains
• Subject Areas
• Format
• Level of protection
• Location
DM CHALLENGES – LEADERSHIP & COMMITMENT
Data Management is neither easy nor simple. Few organizations do it well, so it is a source of untapped opportunity.
To become better it requires vision, planning, and willingness to change.
Most orgs recognize their data as an asset, but are far from being data-driven
Don’t know what data they have; what data is critical
Confuse data and IT; mismanage both
Do not take a strategic approach
A CDO can lead data management activities, driving initiatives and cultural change that enable a
more strategic approach to data
DATA MANAGEMENT STRATEGY
Data Strategy
Comes from the business strategy; what data is needed, how to get it, how it will be managed and utilized.
Data Management Strategy
Data Strategy requires a supporting Data Management Strategy – a plan for maintaining and improving the quality of
data, integrity of data, access and security while mitigating risks and addressing Data Management challenges.
Roles
Typically owned by CDO and enacted through a Data Governance team, supported by a Data Governance Council
Strategic planning includes:
• Vision
• Business Case
• Guiding Principles
• Mission and long-term goals
• Measures
• SMART* objectives
• Roles & Responsibilities
• Descriptions of program components & initiatives
• Prioritized program with scope boundaries
• Roadmap
Deliverables:
• DM Charter
• DM Scope Statement
• DM Implementation Roadmap
DATA MANAGEMENT FRAMEWORKS
Data management involves a set of interdependent functions, each
with its own goals, activities and responsibilities.
Frameworks are developed at different levels of abstraction to provide
perspectives to help us to understand data management
comprehensively and see relationships between components
Many factors influence DM approach such as: industry, range of data,
culture, maturity level, strategy, vision and challenges
Frameworks
Strategic Alignment Model
The Amsterdam Information Model
DAMA-DMBOK Framework
DMBOK Pyramid
DAMA Data Management Framework Evolved
DAMA AND THE DMBOK
DAMA was founded to address the challenges of data management
The DMBOK is an accessible, authoritative reference book for data management professionals and supports
the DAMA mission by:
Providing a functional framework
Establishing a common vocabulary
Serving as a fundamental reference guide
There are 11 Knowledge Areas in the DAMA-DMBOK:
Data Governance
Data Architecture
Data Modeling and Design
Data Storage and Operations
Data Security
Data Integration and Interoperability
Document and Content Management
Reference and Master Data
Data Warehousing and Business Intelligence
Metadata
Data Quality
Q&A
NEXT SESSION
Date Topic Facilitator
February 19th Chapter 1: Data Management Tony Mazzarella
March 4th Chapter 2: Data Handling Ethics Lynn Noel
March 18th Chapter 3: Data Governance Sandi Perillo-Simmons
April 1st Chapter 4: Data Architecture Laura Sebastian-Coleman
April 15th Chapter 5: Data Modeling & Design Lynn Noel
April 29th Chapter 6: Data Storage & Operations Karen Sheridan
May 13th Chapter 7: Data Security Laura Sebastian-Coleman
May 27th Chapter 8: Data Integration & Interoperability Mary Early
June 10th Chapter 9: Document & Content Management Sandi Perillo-Simmons
June 24th Chapter 10: Reference & Master Data Mary Early
July 8th Chapter 11: Data Warehousing & Business Intelligence Tony Mazzarella
July 22nd Chapter 12: Metadata Management Karen Sheridan
August 5th Chapter 13: Data Quality Laura Sebastian-Coleman
August 19th Chapter 14: Big Data & Data Science Nupur Gandhi
September 2nd Chapter 15: Data Management Maturity Assessment Laura Sebastian-Coleman
September 16th Chapter 16: Data Management Organization & Role Expectations Agnes Vega
September 30th Chapter 17: Data Management & Organizational Change Management Tony Mazzarella
October 7th Final Review Tony Mazzarella
NEXT SESSION HOMEWORK
What do you think are the greatest challenges and opportunities
for data handling ethics posed by artificial intelligence and
machine learning? Where and how well does this week’s DMBoK
chapter address the ethics of machine data handling?