Description
Willingness to contribute
Yes. I can contribute a fix for this bug independently.
Feathr version
0.8.0
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 20.0):
- Python version:
- Spark version, if reporting runtime issue: Databricks 10.4 LTS (includes Apache Spark 3.2.1, Scala 2.12)
Describe the problem
When running the NYC driver sample notebook with the 0.8.0 Maven jar, the Spark job fails with a DRIVER_LIBRARY_INSTALLATION_FAILURE error.
Note: this error only happens on Azure Databricks. Using the Maven jar on Synapse or in local PySpark works fine.
The failure occurs because packages pre-built into the Databricks runtime conflict with the Elasticsearch dependencies.
To make it work, users need to exclude the following packages in Databricks:
commons-logging:commons-logging,org.slf4j:slf4j-api,com.google.protobuf:protobuf-java,javax.xml.bind:jaxb-api
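
For reference, if the jar is attached as a Maven library through the Databricks Libraries/Jobs API, these exclusions can be expressed in the library spec. A sketch in Python (the variable name feathr_maven_library is illustrative only; the exact payload depends on how your cluster or job is configured):

# Maven library spec with exclusions for the packages that conflict
# with the Databricks runtime's pre-built dependencies.
feathr_maven_library = {
    "maven": {
        "coordinates": "com.linkedin.feathr:feathr_2.12:0.8.0",
        "exclusions": [
            "commons-logging:commons-logging",
            "org.slf4j:slf4j-api",
            "com.google.protobuf:protobuf-java",
            "javax.xml.bind:jaxb-api",
        ],
    }
}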

Workaround
In the YAML config, add a line for feathr_runtime_location. This makes the Spark cluster use the runtime jar from Azure Storage; see the last line in the following example:
spark_config:
  # choice for spark runtime. Currently support: azure_synapse, databricks
  # The `databricks` configs will be ignored if `azure_synapse` is set and vice versa.
  spark_cluster: "azure_synapse"
  # configure number of parts for the spark output for feature generation job
  spark_result_output_parts: "1"
  databricks:
    # workspace instance
    workspace_instance_url: 'https://adb-6885802458123232.12.azuredatabricks.net/'
    # config string including runtime information, spark version, machine size, etc.
    # the config follows the format in the databricks documentation: https://docs.microsoft.com/en-us/azure/databricks/dev-tools/api/2.0/jobs#--request-structure-6
    # The fields marked as "FEATHR_FILL_IN" will be managed by Feathr. Other parameters are customizable; for example, you can customize the node type, spark version, number of workers, instance pools, timeout, etc.
    config_template: '{"run_name":"FEATHR_FILL_IN","new_cluster":{"spark_version":"9.1.x-scala2.12","node_type_id":"Standard_D3_v2","num_workers":1,"spark_conf":{"FEATHR_FILL_IN":"FEATHR_FILL_IN"}},"libraries":[{"jar":"FEATHR_FILL_IN"}],"spark_jar_task":{"main_class_name":"FEATHR_FILL_IN","parameters":["FEATHR_FILL_IN"]}}'
    # workspace dir for storing all the required configuration files and the jar resources. All the feature definitions will be uploaded here
    work_dir: "dbfs:/feathr_getting_started"
    # This is the location of the runtime jar for Spark job submission. If you have compiled the runtime yourself, you need to specify this location.
    # Or use https://azurefeathrstorage.blob.core.windows.net/public/feathr-assembly-LATEST.jar so you don't have to compile the runtime yourself.
    # Local paths and paths starting with `http(s)://` or `dbfs://` are supported. If not specified, the latest jar from Maven would be used.
    feathr_runtime_location: "https://azurefeathrstorage.blob.core.windows.net/public/feathr-assembly-LATEST.jar"
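
With feathr_runtime_location set, the Feathr client attaches that jar to the Databricks run instead of asking the cluster to resolve the Maven package. A minimal usage sketch in Python (assuming the config above is saved as feathr_config.yaml in the working directory):

from feathr import FeathrClient

# The client reads spark_config.databricks.feathr_runtime_location from the
# YAML config and submits jobs with that jar, so no Maven resolution
# happens on the Databricks cluster.
client = FeathrClient(config_path="./feathr_config.yaml")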
Tracking information
2022-09-26 00:10:42.789 | ERROR | feathr.spark_provider._databricks_submission:wait_for_completion:210 - Feathr job has failed. Please visit this page to view error message: https://adb-1996253548709298.18.azuredatabricks.net/?o=1996253548709298#job/464256084943565/run/253311
2022-09-26 00:10:42.789 | ERROR | feathr.spark_provider._databricks_submission:wait_for_completion:212 - Error Code: Run result unavailable: job failed with error message
Library installation failed for library due to user error for maven {
coordinates: "com.linkedin.feathr:feathr_2.12:0.8.0"
}
. Error messages:
Library installation attempted on the driver node of cluster 0926-000613-mqjvrxxu and failed. Please refer to the following error message to fix the library or contact Databricks support. Error Code: DRIVER_LIBRARY_INSTALLATION_FAILURE. Error Message: Library resolution failed. Cause: java.lang.RuntimeException: commons-logging:commons-logging download failed.
Code to reproduce bug
No response
What component(s) does this bug affect?
- Python Client: This is the client users use to interact with most of our API. Mostly written in Python.
- Computation Engine: The computation engine that executes the actual feature join and generation work. Mostly in Scala and Spark.
- Feature Registry API: The frontend API layer supporting SQL and Purview (Atlas) as storage. The API layer is in Python (FastAPI).
- Feature Registry Web UI: The web UI for the feature registry. Written in React.