Data Quality Management
Learning Objectives
At the end of this session, participants will be
able to:
Summarize basic terminology regarding data
quality management
List and describe the 5 threats to quality data
Identify possible threats to data quality in an
information system
Generate a plan to manage identified threats to
data quality
Why M&E is important
M&E promotes organizational learning and
encourages adaptive management
Content of Workshop Session
What is Data Quality?
The criteria for data quality
Constructing a data quality plan
Data quality auditing
What is Data Quality?
Chess game of cost versus quality
Criterion-based evaluation of data
Criterion-based system of data management
Dimensions of Data Quality
Validity
Reliability
Timeliness
Precision
Integrity
Definition of Validity
A characteristic of measurement in which a
tool actually measures what the researcher
intended to measure
Have we actually measured what we
intended?
Threats to Validity
Definitional issues
Proxy measures
Inclusions / Exclusions
Data sources
Validity:
Questions to ask yourself…
Is there a relationship between the activity or
program and what you are measuring?
What is the data transcription process? Is there
potential for error?
Are steps being taken to limit transcription error
(e.g., double keying of data for large surveys,
built-in validation checks, random checks)?
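The built-in validation checks mentioned above can be sketched as a small record-level check run at transcription time. The field names and valid ranges here ("age", "site_code") are illustrative assumptions, not from the source:

```python
# Minimal sketch of a built-in validation check for transcribed records.
# Field names and ranges are hypothetical examples.
def validate_record(record):
    """Return a list of validation errors for one transcribed record."""
    errors = []
    age = record.get("age")
    # Range check: catches keying errors such as 150 for 15
    if not isinstance(age, int) or not 0 <= age <= 120:
        errors.append("age out of range")
    site = record.get("site_code", "")
    # Format check: assumes a 4-character alphanumeric site code
    if not (len(site) == 4 and site.isalnum()):
        errors.append("malformed site_code")
    return errors
```

Records that return a non-empty error list would be flagged for review rather than entered into the database.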
Definition of Reliability
‘A characteristic of measurement concerned
with consistency’
Can we consistently measure what we
intended?
Threats to Reliability I
Time
Place
People
Threats to Reliability II
Collection methodologies
Collection instruments
Personnel issues
Analysis and manipulation methodologies
Reliability:
Questions to ask yourself…
Is the same instrument used from year to year,
site to site?
Is the same data collection process used from
year to year, site to site?
Are there procedures in place to ensure that
data are free of significant error and that bias
is not introduced (e.g., instructions, indicator
information sheets, training, etc.)?
Definition of Timeliness
The relationship between the time of collection,
collation, and reporting and the relevance of the
data for decision-making processes.
Does the data still have relevance and value
when reported?
Threats to Timeliness
Collection frequencies
Reporting frequencies
Time dependency
Timeliness:
Questions to ask yourself…
Are data available on a frequent enough basis to
inform program management decisions?
Is a regularized schedule of data collection in place
to meet program management needs?
Are data from within the reporting period of interest
(i.e. are the data from a point in time after the
intervention has begun)?
Are the data reported as soon as possible after
collection?
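The last question above can be turned into a simple automated check: flag any record whose reporting lag exceeds the program's decision-making window. The 30-day threshold is an assumed example, not a figure from the source:

```python
from datetime import date

# Assumed threshold: data older than 30 days may no longer inform
# program management decisions.
MAX_LAG_DAYS = 30

def is_timely(collected, reported, max_lag_days=MAX_LAG_DAYS):
    """True if the record was reported within the allowed lag."""
    return (reported - collected).days <= max_lag_days
```

Running this over a batch of records gives a quick timeliness indicator for the reporting period.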
Definition of Precision
Accuracy (measure of bias)
Precision (measure of error)
Is the margin of error in the data less than the
expected change the project was designed to
effect?
Threats to Precision
Source error / bias
Instrumentation error
Transcription error
Manipulation error
Precision:
Questions to ask yourself…
Is the margin of error less than the expected
change being measured?
Are the margins of error acceptable for program
decision making?
Have issues around precision been reported?
Would an increase in the degree of accuracy be
more costly than the increased value of the
information?
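The first question above can be computed directly: compare the margin of error of a sample mean against the change the project was designed to effect. This sketch uses the normal approximation with z = 1.96 for a 95% confidence interval; the inputs are illustrative:

```python
import math

def margin_of_error(sd, n, z=1.96):
    """Margin of error for a sample mean (normal approximation)."""
    return z * sd / math.sqrt(n)

def precise_enough(sd, n, expected_change, z=1.96):
    """True if the margin of error is less than the expected change."""
    return margin_of_error(sd, n, z) < expected_change
```

For example, with a standard deviation of 10 and n = 400 the margin of error is about 0.98, small enough to detect a change of 2 units; with n = 25 it is about 3.92, too large.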
Good Data are Valid, Reliable and Precise
[Figure: three target diagrams. Left: shots scattered across the target (neither accurate/valid, reliable, nor precise). Centre: shots tightly clustered but off the bullseye (reliable and precise, but not accurate/valid). Right: shots tightly clustered on the bullseye (accurate/valid, reliable, and precise).]
Definition of Integrity
Measure of ‘truthfulness’ of the data
Is the data free from ‘untruth’ introduced
by either human or technical means,
whether willfully or unconsciously?
Threats to Integrity I
Time
Temptation
Technology
Threats to Integrity II
Corruption, intentional or unintentional
Personal manipulations
Technological failures
Lack of audit verification and validation
The Data Quality Plan
Operational Plan for managing data quality
Indicator Information Sheets
Includes a Data Quality Risk Analysis
Includes an audit trail reference
Framework for Data Quality
Assessments
Data Management System (Data Management Processes / Procedures):
Source
Collection
Collation
Analysis
Reporting
Usage
Data Quality System (Data Quality Processes / Procedures):
Validity
Reliability
Integrity
Precision
Timeliness
Auditable Risk Verification System:
Paper trail that allows verification of the entire DMS and the data produced within it
Relationships within a Data System
Data Quality Issues at SOURCE
The potential risk of poor data quality increases
with secondary and tertiary data sources
Examples:
Validity: data could be incomplete (incomplete
doctors' notes, illegible notes in patient files)
Reliability: inconsistent recording of information by
different staff because of differing skill levels
To Ensure Data Quality at
SOURCE
Design instruments carefully and correctly
Include data providers (community stakeholders)
and data processors in deciding what is feasible
to collect, in reviewing the process, and in
drafting instruments.
Develop & document instructions for the data
collectors, on the collection forms, and for
computer procedures
To Ensure Data Quality at
SOURCE
Ensure all personnel are trained in their assigned
tasks. Use one trainer if possible
Develop an appropriate sample
Data Quality Issues at
COLLECTION
Incomplete entries in spreadsheets
Incorrect data transcriptions
Data entered in the wrong fields in a database
Inconsistent entries of data by different data
capturers
To ensure data quality during
COLLECTION
Develop specific instructions for data collection
Routinely check to see if instructions are being
followed
Define what to do if you (or someone else) wants
to change the data collection process, or if
problems arise during data collection (a change
management process)
Check to see if people follow the change
management process
Ensure all data collection, entry, and analysis
supplies are available (pens, paper, forms, computers)
To ensure data quality during
COLLECTION
Train data collectors in how to collect information
Develop SOPs for managing the collected data
(e.g., moving data from one point to the next)
Develop SOPs for revising the collection tool
Communicate the process and SOPs
Conduct on-site reviews during the process
To ensure data quality during
COLLATION
Develop check lists and sign off for key steps
Conduct reviews during entry process
Create an electronic or manual format that
includes a data verification process by a second
individual who is not entering the data
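The verification step above (a second individual independently re-entering the data) can be sketched as a double-entry comparison that flags any field where the two entries disagree. The field names are illustrative:

```python
# Double-entry verification sketch: two people key the same record
# independently; mismatched fields are flagged for review.
def compare_entries(entry_a, entry_b):
    """Return field names whose values differ between the two entries."""
    fields = set(entry_a) | set(entry_b)
    return sorted(f for f in fields if entry_a.get(f) != entry_b.get(f))
```

Any flagged field is checked against the original source document before the record is accepted.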
To ensure data quality during
COLLATION
Randomly sample data and verify
Ensure problems are reported, documented,
corrected, communicated, and tracked back
to their source
To ensure data quality during
ANALYSIS
Ensure analysis techniques meet the
requirements for proper use
Disclose all conditions and assumptions affecting
interpretation of the data
Have experts review reports for reasonableness
of analysis
To ensure data quality during
REPORTING
Synthesize results for the appropriate audience
Maintain integrity in reporting – don’t leave out
key info
Have multiple reviewers within the organization -
prior to dissemination!
Protect confidentiality in reports / communication
tools
Review data / provide feedback with those who
have a stake in the results
To ensure data quality during
USAGE
Understand your data!
Use your data!
Minimizing Data Quality Risks
Technology
Ensuring that data analysis/statistical software is up-to-date.
Streamlining instruments and data collection methods.
Competence of personnel
Ensuring that staff are well-versed in all stages of the data
management process (data collection, entry, assessment, risk
analysis, etc.).
Proficiency with data software.
Documentation and audit trails
Outsourcing
Data Quality Audits
Verification
Validation
Self-assessment
Internal audit
External audit
DQA Process
A Plan-Do-Check-Act cycle:
Plan:
Data quality training
Data quality plans
Construct audit plan
Do:
Self-evaluation
Data input
Run error logs
Report generation
Check:
Review self-evaluations
Audit input from partners
Review error logs
Audit data in database
Audit the output reports
Submit audit report
Act:
Close non-compliances
Correct data practices
Clean database
The auditor is responsible for the areas indicated in yellow on the original slide.
M&E Work Plan tasks
Identify the risks associated with your current data
management practices and assign a risk value to them
Identify the contingency plans needed to improve the
data quality practice
Complete a Data Quality Plan for one of the Indicators
you will be reporting against.
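The first task above (assigning a risk value to each identified threat) can be sketched as a simple likelihood-times-impact score. The 1-5 scales and the threat names below are illustrative assumptions, not prescribed by the source:

```python
# Risk-analysis sketch: score each threat as likelihood x impact,
# both on an assumed 1-5 scale, so contingency plans can be prioritised.
def risk_value(likelihood, impact):
    """Combine likelihood and impact (each 1-5) into a single risk score."""
    if not (1 <= likelihood <= 5 and 1 <= impact <= 5):
        raise ValueError("likelihood and impact must be between 1 and 5")
    return likelihood * impact

# Hypothetical threats scored for a work plan
threats = {
    "transcription error": risk_value(4, 3),
    "illegible source notes": risk_value(3, 4),
    "database corruption": risk_value(1, 5),
}
```

Sorting the dictionary by score then gives the order in which contingency plans should be developed.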
Acknowledgements
This presentation was the result of on-going
collaborations between:
USG – The President’s Emergency Plan for
AIDS Relief in South Africa
USAID
MEASURE Evaluation
Khulisa Management Services
MEASURE Evaluation is funded by the U.S. Agency for
International Development (USAID) through Cooperative
Agreement GPO-A-00-03-00003-00 and is implemented by
the Carolina Population Center at the University of North
Carolina in partnership with Futures Group, John Snow,
Inc., Macro International, and Tulane University. Visit us
online at http://www.cpc.unc.edu/measure