Data Lineage

The document outlines the challenges and desired outcomes of data lineage, emphasizing the need for specific success metrics such as time saved on troubleshooting and improved data quality. It highlights the importance of identifying critical data elements, assessing existing infrastructure, and implementing automated lineage tracking to ensure reliable data management. The document also suggests focusing on scenarios that provide immediate value and leveraging tools like OpenLineage and Collibra for effective data lineage tracking.

Uploaded by vaibhavag404

• Outline current data challenges and desired outcomes from data lineage

• Define Specific, Measurable, Achievable, Relevant, and Time-bound (SMART)
critical success factors against which the POV's success will be measured.
• Examples of success metrics:
  • Time saved on data issue troubleshooting.
  • Improved data quality metrics (e.g., accuracy, completeness).
  • Reduced regulatory compliance risks.
  • Faster impact analysis for data changes.
  • Increased trust in data assets for decision-making.
• Identify the most critical data elements and systems that will be the focus of the
POV
• Decide whether you need technical lineage (physical movement) or business
lineage (logical flow), and the level of detail required
• Select specific scenarios where data lineage can deliver immediate and
measurable value, like troubleshooting a persistent data quality issue or
demonstrating compliance for a key report.
• Assess Existing Infrastructure: Evaluate your current data sources, processing
systems, and storage solutions involved in the chosen data workflows.
• Gather Relevant Metadata: Identify and collect existing metadata about the
prioritised data assets, including schemas, data types, transformation rules, and
data quality metrics.
• Ensure Data Quality: Implement data quality controls to ensure the accuracy
and consistency of the data used in the POV, as the lineage will only be as
reliable as the underlying data.
• Implement data lineage tracking. Focus on Automated Lineage: Leverage
automated tracking capabilities to minimise manual effort and ensure real-time
or near real-time updates to lineage information
• Assess the impact of potential changes or actual data quality issues by
leveraging the lineage information
• Example tools for lineage tracking: OpenLineage, Collibra
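
The impact-assessment step above can be illustrated with a small sketch. It assumes lineage has already been captured as a mapping from each dataset to the datasets it feeds; the dataset names and the `downstream_impact` helper are hypothetical and are not taken from OpenLineage or Collibra. A breadth-first walk over that mapping lists everything that could be affected by a change to one source.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the datasets listed in its value.
LINEAGE = {
    "crm.customers": ["staging.customers_clean"],
    "erp.orders": ["staging.orders_clean"],
    "staging.customers_clean": ["mart.customer_orders"],
    "staging.orders_clean": ["mart.customer_orders", "mart.daily_revenue"],
    "mart.customer_orders": ["report.regulatory_summary"],
    "mart.daily_revenue": [],
    "report.regulatory_summary": [],
}


def downstream_impact(lineage, changed):
    """Return every dataset reachable downstream of `changed`, breadth-first."""
    seen = set()
    order = []
    queue = deque(lineage.get(changed, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # already reached through another path
        seen.add(node)
        order.append(node)
        queue.extend(lineage.get(node, []))
    return order


impacted = downstream_impact(LINEAGE, "erp.orders")
print(impacted)
```

In a real POV the edge mapping would come from the lineage tool rather than a hand-written dictionary; the traversal itself stays the same.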

1. Time, Quality, Changes
2. Regulatory and Trust for decision making
3. Find Critical data elements or systems to focus on for POV
4. Technical or business lineage
5. Scenarios where data lineage can deliver immediate, measurable value
6. Implement points one and two on prioritised data elements after collection of
metadata, which includes schema, data types, transformation rules and data
quality metrics
7. Automated tracking using OpenLineage or Collibra
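
As a rough illustration of the metadata named in point 6, the sketch below records schema, data types, and transformation rules for one dataset and attaches a completeness score per column. The class names, field names, sample rows, and the definition of completeness (share of rows with a non-null value) are all assumptions made for the example, not part of any particular lineage tool.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnMetadata:
    name: str
    data_type: str
    transformation_rule: str = ""  # free-text note on how the column is derived


@dataclass
class DatasetMetadata:
    name: str
    columns: list = field(default_factory=list)
    quality_metrics: dict = field(default_factory=dict)


def completeness(rows, column):
    """Fraction of rows where `column` is present and not None (0.0 if no rows)."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(column) is not None)
    return filled / len(rows)


# Hypothetical sample rows for the dataset being profiled.
rows = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 2, "email": None},
    {"customer_id": 3, "email": "c@example.com"},
    {"customer_id": 4},
]

meta = DatasetMetadata(
    name="staging.customers_clean",
    columns=[
        ColumnMetadata("customer_id", "integer"),
        ColumnMetadata("email", "string", "lower-cased and trimmed from crm.customers.email"),
    ],
)

# Attach one completeness score per documented column.
for col in meta.columns:
    meta.quality_metrics[f"completeness.{col.name}"] = completeness(rows, col.name)

print(meta.quality_metrics)
```

Tracking a score like this before and after the POV gives one concrete number for the "improved data quality metrics" success factor.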
