feat: sbt-ci-release#598
Conversation
Codecov Report

All modified and coverable lines are covered by tests ✅

```
@@            Coverage Diff             @@
##           master     #598      +/-   ##
==========================================
- Coverage   91.43%   89.85%   -1.58%
==========================================
  Files          18       21       +3
  Lines         829     1065     +236
  Branches       52      119      +67
==========================================
+ Hits          758      957     +199
- Misses         71      108      +37
```

View full report in Codecov by Sentry.
To check the result:
With a single command:

```
./build/sbt "root/publishLocal; connect/publishLocal; databricksConnect/publishLocal"
```
Is there an issue explaining why we need Databricks-specific logic? Should we link that explanation here in the PR?
mrpowers-io/tsumugi-spark#55 (comment) For reasons unknown to me, Databricks engineers use their own shading rule, which differs from the OSS one.
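For context, relocating a dependency in an sbt build is typically done with sbt-assembly shade rules. The sketch below shows the general shape of such a rule; the package patterns and target prefix are illustrative assumptions for the example, not the actual rules used by this PR or by Databricks.

```scala
// build.sbt -- illustrative sketch only: shading (relocating) a dependency
// with the sbt-assembly plugin. Package names below are assumptions.
assembly / assemblyShadeRules := Seq(
  // Rewrite com.google.protobuf.* classes into a private package so they
  // cannot clash with a differently shaded copy on the runtime classpath.
  ShadeRule.rename("com.google.protobuf.**" -> "org.example.shaded.protobuf.@1").inAll
)
```

If two parties shade the same dependency under different target packages, the relocated classes are binary-incompatible with each other, which is exactly the kind of conflict a separate distribution works around.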
james-willis
left a comment
I really don't like the idea of having a Databricks-specific distribution. However, I don't have a better option here. I asked around a bit to see whether there is something vendor-agnostic we can do instead.
I will approve once the databricks typo is fixed.
@WeichenXu123 Hello! Do you know whether we can use the word "databricks" in the name of published artifacts? cc: @rjurney
LGTM!
I'm waiting for approval from @dmatrix on whether we can use the word "databricks" in the names of released artifacts, code, and documentation.
I suggest avoiding "databricks" in the name and using a name like …, but I think supporting Spark Connect is not sufficient to make it work on Databricks shared clusters; we would also need to whitelist a bunch of JVM methods in the lib for running on Databricks shared clusters (and that needs our security team's approval).
Does the API change from protobuf.Any to Array[Byte] in Spark 4 fix the shading/binary-compatibility issue, at least in Spark 4+? If it does, that might be a good compromise to avoid a dedicated artifact for a closed-source fork.
I did not try it, but it looks like this change in Spark 4.0.x should resolve the mentioned problem.
@james-willis @Kimahriman @rjurney @dmatrix Hello! After some discussions with engineers from Databricks and based on your comments I did the following updates:
Please take a look at the PR when you have time. cc: @WeichenXu123
I looked into what Sedona is doing here, and it seems that it shades in its own version of dependencies when Databricks has some kind of conflict: https://github.com/apache/sedona/blob/e8c0d5af41670757c6d319d9c0d393b0154ec929/common/pom.xml#L143
james-willis
left a comment
Will making users compile from source for dbx impede adoption?
Running Spark Connect plugins on Databricks is not an easy task anyway, so I think we should be fine. Spark 4.0 should resolve this.
What changes were proposed in this pull request?
A new `databricks-connect` subproject in `build.sbt`.

Why are the changes needed?
Part of #510
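A minimal sketch of what such a subproject wiring could look like in `build.sbt`. The project layout and artifact name here are assumptions inferred from the `publishLocal` command in this thread, not the PR's actual build definition.

```scala
// build.sbt -- illustrative sketch, not the PR's actual build definition.
lazy val connect = project
  .in(file("connect"))

// A dedicated subproject that reuses the connect module but can apply
// Databricks-specific settings (e.g. a different shading rule) when packaging.
lazy val databricksConnect = project
  .in(file("databricks-connect"))
  .dependsOn(connect)
  .settings(
    // Publish under a distinct artifact name (name is an assumption) so
    // users can explicitly pick the Databricks-compatible build.
    name := "tsumugi-databricks-connect"
  )
```

With this split, `./build/sbt databricksConnect/publishLocal` would publish the Databricks-specific artifact without touching the OSS `connect` one.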