
[BUG] Spark job with 0.8.0 Maven jar fails with error commons-logging:commons-logging download failed. #705

@blrchen

Description


Willingness to contribute

Yes. I can contribute a fix for this bug independently.

Feathr version

0.8.0

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 20.0):
  • Python version:
  • Spark version, if reporting runtime issue: Databricks 10.4 LTS (includes Apache Spark 3.2.1, Scala 2.12)

Describe the problem

Running the NYC driver sample notebook with the 0.8.0 Maven jar, the Spark job fails with a DRIVER_LIBRARY_INSTALLATION_FAILURE error.

Note: this error only happens on Azure Databricks. Using the Maven jar on Synapse or in local PySpark works fine.

This failure is caused by a conflict between packages pre-installed in the Databricks runtime and Feathr's Elasticsearch dependencies.
To make it work, users need to exclude the following packages when installing the library in Databricks:
commons-logging:commons-logging, org.slf4j:slf4j-api, com.google.protobuf:protobuf-java, javax.xml.bind:jaxb-api
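As a sketch of what that exclusion looks like when installing the jar through the Databricks Jobs/Libraries API (field names follow the Databricks `maven` library spec; the exclusion list is the one above), the library entry could be written as:

```json
{
  "libraries": [
    {
      "maven": {
        "coordinates": "com.linkedin.feathr:feathr_2.12:0.8.0",
        "exclusions": [
          "commons-logging:commons-logging",
          "org.slf4j:slf4j-api",
          "com.google.protobuf:protobuf-java",
          "javax.xml.bind:jaxb-api"
        ]
      }
    }
  ]
}
```

The same exclusions can be entered in the "Exclusions" field of the Databricks library installation UI.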

Workaround

In the YAML config, add a line for feathr_runtime_location. This makes the Spark cluster use the runtime jar from Azure Storage instead of resolving it from Maven; see the last line in the following example:

spark_config:
  # choice for spark runtime. Currently support: azure_synapse, databricks
  # The `databricks` configs will be ignored if `azure_synapse` is set and vice versa.
  spark_cluster: "databricks"
  # configure number of parts for the spark output for feature generation job
  spark_result_output_parts: "1"

  databricks:
    # workspace instance
    workspace_instance_url: 'https://adb-6885802458123232.12.azuredatabricks.net/'
    # config string including run time information, spark version, machine size, etc.
    # the config follows the format in the databricks documentation: https://docs.microsoft.com/en-us/azure/databricks/dev-tools/api/2.0/jobs#--request-structure-6
    # The fields marked as "FEATHR_FILL_IN" will be managed by Feathr. Other parameters can be customizable. For example, you can customize the node type, spark version, number of workers, instance pools, timeout, etc.
    config_template: '{"run_name":"FEATHR_FILL_IN","new_cluster":{"spark_version":"9.1.x-scala2.12","node_type_id":"Standard_D3_v2","num_workers":1,"spark_conf":{"FEATHR_FILL_IN":"FEATHR_FILL_IN"}},"libraries":[{"jar":"FEATHR_FILL_IN"}],"spark_jar_task":{"main_class_name":"FEATHR_FILL_IN","parameters":["FEATHR_FILL_IN"]}}'
    # workspace dir for storing all the required configuration files and the jar resources. All the feature definitions will be uploaded here
    work_dir: "dbfs:/feathr_getting_started"
    # This is the location of the runtime jar for Spark job submission. If you have compiled the runtime yourself, you need to specify this location.
    # Or use https://azurefeathrstorage.blob.core.windows.net/public/feathr-assembly-LATEST.jar so you don't have to compile the runtime yourself
    # Local path, path starting with `http(s)://` or `dbfs://` are supported. If not specified, the latest jar from Maven would be used
    feathr_runtime_location: "https://azurefeathrstorage.blob.core.windows.net/public/feathr-assembly-LATEST.jar"

Tracking information

2022-09-26 00:10:42.789 | ERROR    | feathr.spark_provider._databricks_submission:wait_for_completion:210 - Feathr job has failed. Please visit this page to view error message: https://adb-1996253548709298.18.azuredatabricks.net/?o=1996253548709298#job/464256084943565/run/253311
2022-09-26 00:10:42.789 | ERROR    | feathr.spark_provider._databricks_submission:wait_for_completion:212 - Error Code: Run result unavailable: job failed with error message
 Library installation failed for library due to user error for maven {
  coordinates: "com.linkedin.feathr:feathr_2.12:0.8.0"
}
. Error messages:
Library installation attempted on the driver node of cluster 0926-000613-mqjvrxxu and failed. Please refer to the following error message to fix the library or contact Databricks support. Error Code: DRIVER_LIBRARY_INSTALLATION_FAILURE. Error Message: Library resolution failed. Cause: java.lang.RuntimeException: commons-logging:commons-logging download failed.

Code to reproduce bug

No response

What component(s) does this bug affect?

  • Python Client: This is the client users use to interact with most of our APIs. Mostly written in Python.
  • Computation Engine: The computation engine that executes the actual feature join and generation work. Mostly in Scala and Spark.
  • Feature Registry API: The API layer that supports SQL and Purview (Atlas) as storage. Written in Python (FastAPI).
  • Feature Registry Web UI: The web UI for the feature registry. Written in React.

Labels

bug (Something isn't working)
