@dolfinus dolfinus commented May 28, 2025

Problem

Currently the dbt integration reports a custom dbt_version facet instead of the standard processing_engine facet that all other integrations report. Let's fix that.

Solution

One-line summary:

DBT: Add processing_engine facet
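
For illustration, here is a minimal sketch of the two facet shapes as plain dicts. It does not use the OpenLineage client classes. The field names name, version and openlineageAdapterVersion follow my reading of the standard processing_engine facet and should be checked against the spec. The helper names, the "dbt" engine name and the version numbers are hypothetical, and required metadata such as the producer and schema URL is left out. Whether the old dbt_version facet is still emitted is not stated in this description.

```python
from typing import Dict, Optional


def legacy_run_facets(dbt_version: str) -> Dict[str, dict]:
    """Old shape: a dbt-specific facet that only this integration reported."""
    return {"dbt_version": {"version": dbt_version}}


def standard_run_facets(
    dbt_version: str, adapter_version: Optional[str] = None
) -> Dict[str, dict]:
    """New shape: the facet key shared with the other integrations.

    The "dbt" engine name is an assumption made for this sketch.
    """
    facet = {"name": "dbt", "version": dbt_version}
    if adapter_version is not None:
        # Version of the OpenLineage integration itself, if known.
        facet["openlineageAdapterVersion"] = adapter_version
    return {"processing_engine": facet}


if __name__ == "__main__":
    print(legacy_run_facets("1.9.4"))
    print(standard_run_facets("1.9.4", "1.34.0"))
```

A consumer that already reads processing_engine from other integrations could then handle dbt runs without a dbt-specific branch.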

Checklist

  • You've signed-off your work
  • Your pull request title follows our guidelines
  • Your changes are accompanied by tests (if relevant)
  • Your change contains a small diff and is self-contained
  • You've updated any relevant documentation (if relevant)
  • Your comment includes a one-liner for the changelog about the specific purpose of the change (not required for changes to tests, docs, or CI config)
  • You've versioned the core OpenLineage model or facets according to SchemaVer (if relevant)
  • You've added a header to source files (if relevant)

SPDX-License-Identifier: Apache-2.0
Copyright 2018-2025 contributors to the OpenLineage project

@dolfinus dolfinus requested a review from a team as a code owner May 28, 2025 14:36
@boring-cyborg boring-cyborg bot added the area:integration/common, area:tests and language:python labels May 28, 2025
@dolfinus dolfinus force-pushed the improvement/dbt-processing-engine branch from 8cdaf0e to bf8bda7 Compare May 28, 2025 14:58
codecov-commenter commented May 28, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 86.13%. Comparing base (98d8ed8) to head (b49b032).
Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3725      +/-   ##
==========================================
+ Coverage   85.86%   86.13%   +0.26%     
==========================================
  Files          57       57              
  Lines        3651     3657       +6     
==========================================
+ Hits         3135     3150      +15     
+ Misses        516      507       -9     

@kacpermuda kacpermuda left a comment

Leaving nit comments. I'm also not sure we should hardcode versions in tests, or use any instead of a regex for version validation, but since neither was introduced in this PR, we can improve that in a separate PR.
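
As a rough sketch of that suggestion, assuming "any" refers to a wildcard matcher such as unittest.mock.ANY: the snippet below contrasts a wildcard comparison with a regex check on the version string. The facet payload, the pattern and the is_version helper are all hypothetical and are not taken from the test suite.

```python
import re
from unittest.mock import ANY

# Hypothetical facet payload standing in for what a test would capture.
facet = {"name": "dbt", "version": "1.9.4", "openlineageAdapterVersion": "1.34.0"}

# Loose check: ANY compares equal to everything, so an empty or malformed
# version string would still pass this assertion.
assert facet == {"name": "dbt", "version": ANY, "openlineageAdapterVersion": ANY}

# Stricter check: three dot-separated numbers, optionally followed by a
# suffix such as "rc1" or "-SNAPSHOT". The pattern is an assumption.
VERSION_RE = re.compile(r"\d+\.\d+\.\d+[0-9A-Za-z.\-+]*")


def is_version(value) -> bool:
    """Return True if value is a string that looks like a version number."""
    return isinstance(value, str) and VERSION_RE.fullmatch(value) is not None


assert is_version(facet["version"])
assert is_version(facet["openlineageAdapterVersion"])
```

The regex keeps tests independent of the exact installed versions while still rejecting values that are not version-like.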

@dolfinus dolfinus force-pushed the improvement/dbt-processing-engine branch 3 times, most recently from 3ab4229 to 8eed3f1 Compare May 29, 2025 07:56
@dolfinus dolfinus force-pushed the improvement/dbt-processing-engine branch from 8eed3f1 to b49b032 Compare May 29, 2025 08:04
@mobuchowski mobuchowski merged commit 8bcc60e into OpenLineage:main May 30, 2025
47 checks passed
@dolfinus dolfinus deleted the improvement/dbt-processing-engine branch May 30, 2025 09:13
marccampa pushed a commit to marccampa/OpenLineage-Collibra that referenced this pull request Jun 26, 2025
Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>
mobuchowski added a commit that referenced this pull request Jul 3, 2025
* Update consumers.tsx

* Add files via upload

* Update consumers.tsx

* Add files via upload

* Update consumers.tsx

* Update consumers.tsx

* Update consumers.tsx

* Update consumers.tsx

* Delete Collibra-Logo-RGB.png

* Update consumers.tsx

* [Flink] Do not hide OpenLineage config parsing errors (#3724)

Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>

* Update consumers.tsx

Signed-off-by: marccampa <[email protected]>

* Add files via upload

Signed-off-by: marccampa <[email protected]>

* Update consumers.tsx

Signed-off-by: marccampa <[email protected]>

* Add files via upload

Signed-off-by: marccampa <[email protected]>

* Update consumers.tsx

Signed-off-by: marccampa <[email protected]>

* [DBT] Add processing_engine facet (#3725)

Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>

* java: prevent original events from being mutated in TransformTransport (#3728)

- Add deepCopy utility method to OpenLineageClientUtils for safe object cloning
- Modify TransformTransport to create deep copies of events before transformation

Signed-off-by: Jakub Dardzinski <[email protected]>
Signed-off-by: marccampa <[email protected]>

* [Flink] Add processing_engine facet (#3726)

Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>

* Add Github stars statistics to Readme (#3730)

Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>

* [DBT] Document supported adapters (#3729)

Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>

* Prettify Spark JSON event examples (#3740)

Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>

* Prettify Flink JSON event examples (#3742)

Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>

* Prettify Airflow JSON event examples (#3741)

Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>

* Prettify DBT JSON event examples (#3743)

Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>

* Restyle cardmedia. (#3733)

Signed-off-by: merobi-hub <[email protected]>
Signed-off-by: marccampa <[email protected]>

* dbt-ol should not error on job complete if there is no start event (#3749)

Signed-off-by: Maciej Obuchowski <[email protected]>
Signed-off-by: marccampa <[email protected]>

* [Flink] Add facet with Flink jobId (#3744)

Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>

* Update consumers.tsx

Signed-off-by: marccampa <[email protected]>

* Update consumers.tsx

Signed-off-by: marccampa <[email protected]>

* [DBT] Initial support for Clickhouse (#3739)

Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>

* [SPEC] Add contentType to documentation facet (#3748)

Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>

* [spark] Update Spark 4 dependency to 4.0.0 (remove -preview1 suffix) (#3751)

Signed-off-by: Dominik Dębowczyk <[email protected]>
Signed-off-by: marccampa <[email protected]>

* filter temp inner jobs for bigquery indirect mode (#3722)

Signed-off-by: Pawel Leszczynski <[email protected]>
Signed-off-by: marccampa <[email protected]>

* [Docs] Add documentation for some facets (#3752)

Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>

* Tweak the Maven signing config (#3069)

This tweak allows Gradle to default on using values set in `~/.gradle/gradle.properties`

Signed-off-by: Julien Phalip <[email protected]>
Signed-off-by: marccampa <[email protected]>

* Run prettier on .json files (#3750)

Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>

* remove native proxy (#3680)

* remove native proxy

Signed-off-by: Maciej Obuchowski <[email protected]>

# Conflicts:
#	proxy/backend/gradle.properties

* remove leftover proxy gradle reference

Signed-off-by: Kacper Muda <[email protected]>

---------

Signed-off-by: Kacper Muda <[email protected]>
Co-authored-by: Kacper Muda <[email protected]>
Signed-off-by: marccampa <[email protected]>

* [DBT] Add DbtRun facet (#3738)

Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>

* Remove Airflow < 2.5.0 support (#3669)

Signed-off-by: Kacper Muda <[email protected]>
Signed-off-by: marccampa <[email protected]>

* nit: fix supported airflow versions (#3755)

Signed-off-by: Kacper Muda <[email protected]>
Signed-off-by: marccampa <[email protected]>

* [Java] Speedup generateNewUUID (#3754)

Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>

* [DBT] Use adapter rows_affected as outputStatistics (#3731)

Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>

* fix variables in docs for setting of root parents in spark config (#3761)

Signed-off-by: Humzah Kiani <[email protected]>
Signed-off-by: marccampa <[email protected]>

* [spark] Add support for Big Query Metastore catalog type (#3760)

Signed-off-by: Dominik Dębowczyk <[email protected]>
Signed-off-by: marccampa <[email protected]>

* Fix visibility of GcpLineageTransportConfig.Mode (#3762)

* Register GCP common job facet

Signed-off-by: Natalia Gorchakova <[email protected]>

* Add ACCEPT_CASE_INSENSITIVE_ENUMS for ObjectMapper to ensure that lower and upper case enum values are accepted for config

Signed-off-by: Natalia Gorchakova <[email protected]>

---------

Signed-off-by: Natalia Gorchakova <[email protected]>
Signed-off-by: marccampa <[email protected]>

* Update consumers.tsx

Signed-off-by: marccampa <[email protected]>

* Delete Collibra-Logo-RGB.png

Signed-off-by: marccampa <[email protected]>

* update  httpConfig Headers and TimeoutInMillis property values (#3767)

Signed-off-by: Nidhin Varghese <[email protected]>
Signed-off-by: marccampa <[email protected]>

* [java] Add log if load from yaml fails (#3766)

Signed-off-by: Fiore Mario Vitale <[email protected]>
Signed-off-by: marccampa <[email protected]>

* smart debug facet (#3715)

Signed-off-by: Pawel Leszczynski <[email protected]>
Signed-off-by: marccampa <[email protected]>

* [Spark] Fix missing table path in InsertIntoHadoopFsRelationCommand (#3773)

Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>

* Github: mark Hive PRs with proper label (#3778)

Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>

* fix configurable test failing in CI (#3782)

Signed-off-by: Pawel Leszczynski <[email protected]>
Signed-off-by: marccampa <[email protected]>

* Column level lineage for jdbc queries load (#3763)

* test column level lineage for jdbc queries load

Signed-off-by: Pawel Leszczynski <[email protected]>

* refactor jdbc lineage visitor

Signed-off-by: Pawel Leszczynski <[email protected]>

---------

Signed-off-by: Pawel Leszczynski <[email protected]>
Signed-off-by: marccampa <[email protected]>

* chore: Use attr.define instead of attr.s (#3776)

Signed-off-by: Kacper Muda <[email protected]>
Signed-off-by: marccampa <[email protected]>

* [Hive] Add job sql facet (#3777)

Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>

* build(deps): bump the integration-sql group (#3704)

Updates the requirements on [pyo3](https://github.com/pyo3/pyo3) and [pyo3-build-config](https://github.com/pyo3/pyo3) to permit the latest version.

Updates `pyo3` to 0.25.0
- [Release notes](https://github.com/pyo3/pyo3/releases)
- [Changelog](https://github.com/PyO3/pyo3/blob/main/CHANGELOG.md)
- [Commits](PyO3/pyo3@v0.24.0...v0.25.0)

Updates `pyo3-build-config` to 0.25.0
- [Release notes](https://github.com/pyo3/pyo3/releases)
- [Changelog](https://github.com/PyO3/pyo3/blob/main/CHANGELOG.md)
- [Commits](PyO3/pyo3@v0.24.0...v0.25.0)

---
updated-dependencies:
- dependency-name: pyo3
  dependency-version: 0.25.0
  dependency-type: direct:production
  dependency-group: integration-sql
- dependency-name: pyo3-build-config
  dependency-version: 0.25.0
  dependency-type: direct:production
  dependency-group: integration-sql
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: marccampa <[email protected]>

* [Hive] Add hive_query facet (#3781)

Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>

* Website: correct Node version in README (#3783)

* Fix node version in website readme.

Signed-off-by: merobi-hub <[email protected]>

* Misc fixes.

Signed-off-by: merobi-hub <[email protected]>

---------

Signed-off-by: merobi-hub <[email protected]>
Signed-off-by: marccampa <[email protected]>

* [spark] Disable module metadata file generation (#3785)

Signed-off-by: Dominik Dębowczyk <[email protected]>
Signed-off-by: marccampa <[email protected]>

* Add Debezium to producers (#3787)

Signed-off-by: Fiore Mario Vitale <[email protected]>
Signed-off-by: marccampa <[email protected]>

* build(deps): bump requests from 2.32.0 to 2.32.4 in /dev (#3759)

Bumps [requests](https://github.com/psf/requests) from 2.32.0 to 2.32.4.
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](psf/requests@v2.32.0...v2.32.4)

---
updated-dependencies:
- dependency-name: requests
  dependency-version: 2.32.4
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: marccampa <[email protected]>

* [Hive] Add hive_session facet (#3786)

* [Hive] Add hive_session facet

Signed-off-by: Martynov Maxim <[email protected]>

* [Hive] Record hive session creation time

Signed-off-by: Martynov Maxim <[email protected]>

---------

Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>

* changelog for release 1.34.0 (#3790)

Signed-off-by: Maciej Obuchowski <[email protected]>
Signed-off-by: marccampa <[email protected]>

* Prepare for release 1.34.0

Signed-off-by: Maciej Obuchowski <[email protected]>
Signed-off-by: marccampa <[email protected]>

* Prepare next development version 1.35.0-SNAPSHOT

Signed-off-by: Maciej Obuchowski <[email protected]>
Signed-off-by: marccampa <[email protected]>

* chore: Fix changelog item authors (#3791)

Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>

* [Hive] Add jobType facet (#3789)

Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>

* Website: update README (#3801)

* Update website readme.

Signed-off-by: merobi-hub <[email protected]>

* Fix code blocks.

Signed-off-by: merobi-hub <[email protected]>

* Fix wordiness.

Signed-off-by: merobi-hub <[email protected]>

* More details in deployment sec.

Signed-off-by: merobi-hub <[email protected]>

* Continued.

Signed-off-by: merobi-hub <[email protected]>

* Continued.

Signed-off-by: merobi-hub <[email protected]>

* Continued.

Signed-off-by: merobi-hub <[email protected]>

---------

Signed-off-by: merobi-hub <[email protected]>
Signed-off-by: marccampa <[email protected]>

* fix spotless in hive integration (#3806)

Signed-off-by: Maciej Obuchowski <[email protected]>
Signed-off-by: marccampa <[email protected]>

* run Java SQL tests (#3808)

Signed-off-by: Maciej Obuchowski <[email protected]>
Signed-off-by: marccampa <[email protected]>

* [Hive] Add docker-compose example for local testing (#3800)

Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>

* [DBT] Make invocation_id field optional (#3796)

Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>

* Remove empty Flink page. (#3810)

Signed-off-by: Jakub Dardzinski <[email protected]>
Signed-off-by: marccampa <[email protected]>

* Flink integration: Fixed a bug incorrectly loading configuration in Event Emitter (#3799)

* Flink integration: Fixed a bug incorrectly loading configuration in Event Emitter, resulting in "disabled facets" feature not working (and probably others as well).
Signed-off-by: Jan Siekierski <[email protected]>

---------

Co-authored-by: Jan Siekierski <[email protected]>
Signed-off-by: marccampa <[email protected]>

* Website: add missing guidance to readme (#3807)

* Add missing guidance to readme.

Signed-off-by: merobi-hub <[email protected]>

* Img file formats.

Signed-off-by: merobi-hub <[email protected]>

---------

Signed-off-by: merobi-hub <[email protected]>
Signed-off-by: marccampa <[email protected]>

* build(deps): bump urllib3 from 1.26.19 to 2.5.0 in /dev (#3794)

Bumps [urllib3](https://github.com/urllib3/urllib3) from 1.26.19 to 2.5.0.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](urllib3/urllib3@1.26.19...2.5.0)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-version: 2.5.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: marccampa <[email protected]>

* dbt: fix log path, more precise file reading (#3793)

Signed-off-by: Maciej Obuchowski <[email protected]>
Signed-off-by: marccampa <[email protected]>

* Spark: fix & upgrade databricks test (#3811)

Signed-off-by: Pawel Leszczynski <[email protected]>
Signed-off-by: marccampa <[email protected]>

* Formalize dataset naming (#3775)

* Formalize dataset naming

---------

Signed-off-by: Dominik Dębowczyk <[email protected]>
Signed-off-by: marccampa <[email protected]>

* Update consumers.tsx

Signed-off-by: marccampa <[email protected]>

* Apply prettier fix.

Signed-off-by: merobi-hub <[email protected]>

---------

Signed-off-by: Martynov Maxim <[email protected]>
Signed-off-by: marccampa <[email protected]>
Signed-off-by: Jakub Dardzinski <[email protected]>
Signed-off-by: merobi-hub <[email protected]>
Signed-off-by: Maciej Obuchowski <[email protected]>
Signed-off-by: Dominik Dębowczyk <[email protected]>
Signed-off-by: Pawel Leszczynski <[email protected]>
Signed-off-by: Julien Phalip <[email protected]>
Signed-off-by: Kacper Muda <[email protected]>
Signed-off-by: Humzah Kiani <[email protected]>
Signed-off-by: Natalia Gorchakova <[email protected]>
Signed-off-by: Nidhin Varghese <[email protected]>
Signed-off-by: Fiore Mario Vitale <[email protected]>
Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Maxim Martynov <[email protected]>
Co-authored-by: Jakub Dardzinski <[email protected]>
Co-authored-by: Michael Robinson <[email protected]>
Co-authored-by: Maciej Obuchowski <[email protected]>
Co-authored-by: ddebowczyk92 <[email protected]>
Co-authored-by: pawel.leszczynski <[email protected]>
Co-authored-by: Julien Phalip <[email protected]>
Co-authored-by: Kacper Muda <[email protected]>
Co-authored-by: Humzah Kiani <[email protected]>
Co-authored-by: ngorchakova <[email protected]>
Co-authored-by: Nidhin Varghese <[email protected]>
Co-authored-by: Fiore Mario Vitale <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Fiore Mario Vitale <[email protected]>
Co-authored-by: Maciej Obuchowski <[email protected]>
Co-authored-by: Jan Siekierski <[email protected]>
Co-authored-by: Jan Siekierski <[email protected]>
Co-authored-by: merobi-hub <[email protected]>
Labels

area:integration/common, area:tests, language:python

4 participants