@jroachgolf84
Contributor

@jroachgolf84 jroachgolf84 commented Jul 11, 2025

Problem

In Airflow, operators like the SQLExecuteQueryOperator emit rich lineage data, including the query ID. This query ID is not available when using the Astronomer Cosmos project in Airflow.

Closes: NONE

Solution

By parsing run_results, we can extract the query_id from the adapter_response. It is then added to the externalQuery facet in run_facets, which provides the same functionality as the SQLExecuteQueryOperator.
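As a rough illustration of the approach described above (not the code in this PR), the sketch below pulls query_id out of each run_results entry and wraps it in a run_facets-style dictionary. The helper names, the sample payload, and the "snowflake://example" source value are hypothetical; the externalQueryId and source keys are an assumption about how the externalQuery facet is laid out.

```python
from __future__ import annotations

from typing import Any


def extract_query_id(result: dict[str, Any]) -> str | None:
    """Return adapter_response.query_id for one run_results entry, if present."""
    # adapter_response may be missing, None, or empty depending on the adapter.
    adapter_response = result.get("adapter_response") or {}
    query_id = adapter_response.get("query_id")
    return str(query_id) if query_id else None


def build_run_facets(result: dict[str, Any], source: str) -> dict[str, Any]:
    """Build a run_facets dict; include externalQuery only when an ID exists."""
    query_id = extract_query_id(result)
    if query_id is None:
        return {}
    # Key names here are assumed, not taken from the PR diff.
    return {"externalQuery": {"externalQueryId": query_id, "source": source}}


# Made-up fragment shaped like dbt's run_results.json.
sample_run_results = {
    "results": [
        {
            "unique_id": "model.demo.orders",
            "adapter_response": {"query_id": "01abc-0000-demo"},
        },
        {"unique_id": "model.demo.customers", "adapter_response": {}},
    ]
}

facets_by_node = {
    r["unique_id"]: build_run_facets(r, source="snowflake://example")
    for r in sample_run_results["results"]
}
```

Nodes whose adapter_response carries no query_id end up with an empty facet dictionary, so nothing is emitted for them.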

  • Your change modifies the core OpenLineage model
  • Your change modifies one or more OpenLineage facets

One-line summary: Adding query IDs for dbt

Checklist

  • You've signed-off your work
  • Your pull request title follows our guidelines
  • Your changes are accompanied by tests (if relevant)
  • Your change contains a small diff and is self-contained
  • You've updated any relevant documentation (if relevant)
  • Your comment includes a one-liner for the changelog about the specific purpose of the change (not required for changes to tests, docs, or CI config)
  • You've versioned the core OpenLineage model or facets according to SchemaVer (if relevant)
  • You've added a header to source files (if relevant)

SPDX-License-Identifier: Apache-2.0
Copyright 2018-2025 contributors to the OpenLineage project

@jroachgolf84 jroachgolf84 requested a review from a team as a code owner July 11, 2025 12:20
@boring-cyborg boring-cyborg bot added area:integration/common openlineage-integration-common language:python Uses Python programming language labels Jul 11, 2025
@boring-cyborg

boring-cyborg bot commented Jul 11, 2025

Thanks for opening your first OpenLineage pull request! We appreciate your contribution. If you haven't already, please make sure you've reviewed our guide for new contributors (https://github.com/OpenLineage/OpenLineage/blob/main/CONTRIBUTING.md).

Collaborator

@tatiana tatiana left a comment

Thanks for working on this, @jroachgolf84! We may need to add some tests to cover this change. We may have to update the documentation - is the facet externalQuery new or did it exist before?

@kacpermuda please, could you take a look and give feedback?

@mobuchowski
Member

@jroachgolf84
Contributor Author

> We may need to add some tests to cover this change. We may have to update the documentation - is the facet externalQuery new or did it exist before?

Agreed! There are very few unit tests for dbt, and none specifically for the DbtArtifactProcessor. Happy to create some! BTW, I should have marked this PR as a draft.

@jroachgolf84
Contributor Author

> @jroachgolf84 some other connectors are using different fields, like BigQuery and Glue, could you make this field depend on adapter type?

@mobuchowski - for these adapters, is the field name not adapter_response.query_id? Is that only applicable for Snowflake?

@mobuchowski
Member

@jroachgolf84 I've looked at a few, and query_id works for Snowflake and Databricks, but not BigQuery (job_id) or Glue (I think it's statement.id). Some others don't have any query id, like Athena or Redshift.
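The per-adapter differences listed above could be captured in a small lookup table. This is only a sketch under the assumptions in this comment, not the merged implementation: QUERY_ID_FIELD_BY_ADAPTER and get_query_id are hypothetical names, and Glue is left out because its field name is unconfirmed.

```python
from __future__ import annotations

from typing import Any

# Adapter type -> key in adapter_response that holds the query identifier.
# Only the adapters confirmed in this thread are listed (assumption).
QUERY_ID_FIELD_BY_ADAPTER = {
    "snowflake": "query_id",
    "databricks": "query_id",
    "bigquery": "job_id",
}


def get_query_id(
    adapter_type: str, adapter_response: dict[str, Any] | None
) -> str | None:
    """Return the adapter-specific query identifier, or None if unavailable."""
    field = QUERY_ID_FIELD_BY_ADAPTER.get(adapter_type.lower())
    if field is None or not adapter_response:
        # Unknown adapter, or the adapter reported nothing.
        return None
    value = adapter_response.get(field)
    return str(value) if value else None
```

Adapters without an entry (for example Athena or Redshift) simply yield None, so no externalQuery facet would be emitted for them.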

@jroachgolf84
Contributor Author

@mobuchowski, perfect, thanks for the heads up. I'll go ahead and update accordingly.

Do you have any recommendations on the best way to test this? I'd lean towards writing tests only for the get_run() method to start, which would include tests for multiple different connectors. Thoughts? First time contributing to this code-base for me, and I haven't been able to find docs for running unit/integration tests.

@mobuchowski
Member

@jroachgolf84 we've just added new dbt integration tests: #3872

However, they only run the duckdb adapter, which does not include a query ID in the adapter response.
Maybe you're willing to add Snowflake ones - the relevant connection details should be in the CI test environment.

@jroachgolf84
Contributor Author

@mobuchowski, I will most certainly take a look at this. In the meantime, do you have any docs about actually executing unit-tests that have been written already?

@boring-cyborg boring-cyborg bot added the area:tests Testing code label Jul 14, 2025
@jroachgolf84
Contributor Author

@mobuchowski, I've gone ahead and added unit-tests for my changes to the DbtArtifactProcessor.

Contributor

@kacpermuda kacpermuda left a comment

Added some comments. It looks good, but maybe we can improve the default path a bit to support more cases.

@jroachgolf84
Contributor Author

@kacpermuda, I've implemented each of the changes you outlined. Do you mind taking another look?

Contributor

@kacpermuda kacpermuda left a comment

LGTM

@mobuchowski mobuchowski merged commit c6e24d0 into OpenLineage:main Jul 16, 2025
25 checks passed
@boring-cyborg

boring-cyborg bot commented Jul 16, 2025

Great job! Congrats on your first merged pull request in OpenLineage!

@codecov-commenter

Codecov Report

Attention: Patch coverage is 93.33333% with 1 line in your changes missing coverage. Please review.

Project coverage is 85.11%. Comparing base (6e18732) to head (858833f).
Report is 7 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...ommon/openlineage/common/provider/dbt/processor.py | 93.33% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3890      +/-   ##
==========================================
+ Coverage   85.06%   85.11%   +0.05%     
==========================================
  Files          57       57              
  Lines        3809     3822      +13     
==========================================
+ Hits         3240     3253      +13     
  Misses        569      569              

Labels

area:integration/common openlineage-integration-common area:tests Testing code language:python Uses Python programming language

5 participants