Spark: add default name to hive catalog facet #4161

tnazarew · 2025-11-20T12:30:20Z

Problem

When a Spark job is configured with the default Hive metastore catalog, like

        SparkSession.builder()
            .master("local[*]")
            .config("spark.sql.catalogImplementation", "hive")
            .config("spark.sql.warehouse.dir", "...")
            .config("hive.metastore.uris", "...")
            ...
            .enableHiveSupport().getOrCreate();

The CatalogDatasetFacet doesn't have the name field set, which is a problem because it's a required property in the facet spec.

Solution

Because no name is set for this catalog, any default value would work; I chose "default".
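
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the class HiveCatalogNameSketch, the helpers resolveCatalogName and buildCatalogFacet, and the plain Map standing in for the facet are all hypothetical, and the "type" entry is only there to make the example readable.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: none of these names come from the OpenLineage codebase.
public class HiveCatalogNameSketch {
    // Fallback used when the Hive catalog exposes no name of its own.
    static final String DEFAULT_CATALOG_NAME = "default";

    // Returns the configured name, or the fallback when it is absent or blank.
    static String resolveCatalogName(Optional<String> configuredName) {
        return configuredName
            .map(String::trim)
            .filter(name -> !name.isEmpty())
            .orElse(DEFAULT_CATALOG_NAME);
    }

    // Builds a simplified stand-in for the facet; a real facet has more fields.
    static Map<String, String> buildCatalogFacet(Optional<String> configuredName) {
        Map<String, String> facet = new LinkedHashMap<>();
        facet.put("name", resolveCatalogName(configuredName)); // always populated
        facet.put("type", "hive"); // illustrative value only
        return facet;
    }

    public static void main(String[] args) {
        // With no configured name, the required "name" entry is still present.
        System.out.println(buildCatalogFacet(Optional.empty()));
    }
}
```

The point of the sketch is only that the required name property is always populated, whichever value ends up being used as the fallback.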

One-line summary:

Add the missing name property to CatalogDatasetFacet for the default Hive catalog so that the facet complies with the spec.

Checklist

You've signed-off your work
Your pull request title follows our guidelines
Your changes are accompanied by tests (if relevant)
Your change contains a small diff and is self-contained
You've updated any relevant documentation (if relevant)
Your comment includes a one-liner for the changelog about the specific purpose of the change (not required for changes to tests, docs, or CI config)
You've versioned the core OpenLineage model or facets according to SchemaVer (if relevant)
You've added a header to source files (if relevant)

SPDX-License-Identifier: Apache-2.0
Copyright 2018-2025 contributors to the OpenLineage project

Signed-off-by: tnazarew <[email protected]>

tnazarew requested a review from a team as a code owner November 20, 2025 12:30

boring-cyborg bot added area:integration/spark language:java Uses Java programming language area:tests Testing code labels Nov 20, 2025

tnazarew added 2 commits November 20, 2025 13:39

add default name to hive catalog

e21a79c

Signed-off-by: tnazarew <[email protected]>

update tests

72d04a9

Signed-off-by: tnazarew <[email protected]>

tnazarew force-pushed the spark/missing-name-for-hive-catalog branch from f97b30c to 72d04a9 Compare November 20, 2025 12:40

mobuchowski approved these changes Nov 24, 2025

mobuchowski merged commit f95a489 into main Nov 24, 2025
30 checks passed

mobuchowski deleted the spark/missing-name-for-hive-catalog branch November 24, 2025 12:26

3 participants
