
[BUG] Spark job with 0.8.0 Maven jar fails with error commons-logging:commons-logging download failed. #705

@blrchen

Description


Willingness to contribute

Yes. I can contribute a fix for this bug independently.

Feathr version

0.8.0

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 20.0):
  • Python version:
  • Spark version, if reporting runtime issue: Databricks 10.4 LTS (includes Apache Spark 3.2.1, Scala 2.12)

Describe the problem

Running the NYC driver sample notebook with the 0.8.0 Maven jar, the Spark job fails with a DRIVER_LIBRARY_INSTALLATION_FAILURE error.

Note: this error only happens on Azure Databricks. Using the Maven jar on Synapse or in local PySpark works fine.

This failure is caused by a conflict between packages pre-installed in the Databricks runtime and Feathr's Elasticsearch dependencies.
To make it work, users need to exclude the following packages when installing the library in Databricks:
commons-logging:commons-logging, org.slf4j:slf4j-api, com.google.protobuf:protobuf-java, javax.xml.bind:jaxb-api
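As a sketch of what that exclusion looks like when installing the jar through the Databricks Jobs/Libraries API (field names follow the Databricks `maven` library spec; the exclusion list is the one above), the library entry could be written as:

```json
{
  "libraries": [
    {
      "maven": {
        "coordinates": "com.linkedin.feathr:feathr_2.12:0.8.0",
        "exclusions": [
          "commons-logging:commons-logging",
          "org.slf4j:slf4j-api",
          "com.google.protobuf:protobuf-java",
          "javax.xml.bind:jaxb-api"
        ]
      }
    }
  ]
}
```

The same exclusions can be entered in the "Exclusions" field of the Databricks library installation UI.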

Workaround

In the YAML config, add a line for feathr_runtime_location. This makes the Spark cluster use the runtime jar from Azure Storage instead of resolving it from Maven; see the last line in the following example:

spark_config:
  # choice for spark runtime. Currently support: azure_synapse, databricks
  # The `databricks` configs will be ignored if `azure_synapse` is set and vice versa.
  spark_cluster: "databricks"
  # configure number of parts for the spark output for feature generation job
  spark_result_output_parts: "1"

  databricks:
    # workspace instance
    workspace_instance_url: 'https://adb-6885802458123232.12.azuredatabricks.net/'
    # config string including run time information, spark version, machine size, etc.
    # the config follows the format in the databricks documentation: https://docs.microsoft.com/en-us/azure/databricks/dev-tools/api/2.0/jobs#--request-structure-6
    # The fields marked as "FEATHR_FILL_IN" will be managed by Feathr. Other parameters can be customizable. For example, you can customize the node type, spark version, number of workers, instance pools, timeout, etc.
    config_template: '{"run_name":"FEATHR_FILL_IN","new_cluster":{"spark_version":"9.1.x-scala2.12","node_type_id":"Standard_D3_v2","num_workers":1,"spark_conf":{"FEATHR_FILL_IN":"FEATHR_FILL_IN"}},"libraries":[{"jar":"FEATHR_FILL_IN"}],"spark_jar_task":{"main_class_name":"FEATHR_FILL_IN","parameters":["FEATHR_FILL_IN"]}}'
    # workspace dir for storing all the required configuration files and the jar resources. All the feature definitions will be uploaded here
    work_dir: "dbfs:/feathr_getting_started"
    # This is the location of the runtime jar for Spark job submission. If you have compiled the runtime yourself, you need to specify this location.
    # Or use https://azurefeathrstorage.blob.core.windows.net/public/feathr-assembly-LATEST.jar so you don't have to compile the runtime yourself
    # Local path, path starting with `http(s)://` or `dbfs://` are supported. If not specified, the latest jar from Maven would be used
    feathr_runtime_location: "https://azurefeathrstorage.blob.core.windows.net/public/feathr-assembly-LATEST.jar"

Tracking information

2022-09-26 00:10:42.789 | ERROR    | feathr.spark_provider._databricks_submission:wait_for_completion:210 - Feathr job has failed. Please visit this page to view error message: https://adb-1996253548709298.18.azuredatabricks.net/?o=1996253548709298#job/464256084943565/run/253311
2022-09-26 00:10:42.789 | ERROR    | feathr.spark_provider._databricks_submission:wait_for_completion:212 - Error Code: Run result unavailable: job failed with error message
 Library installation failed for library due to user error for maven {
  coordinates: "com.linkedin.feathr:feathr_2.12:0.8.0"
}
. Error messages:
Library installation attempted on the driver node of cluster 0926-000613-mqjvrxxu and failed. Please refer to the following error message to fix the library or contact Databricks support. Error Code: DRIVER_LIBRARY_INSTALLATION_FAILURE. Error Message: Library resolution failed. Cause: java.lang.RuntimeException: commons-logging:commons-logging download failed.

Code to reproduce bug

No response

What component(s) does this bug affect?

  • Python Client: This is the client users use to interact with most of our APIs. Mostly written in Python.
  • Computation Engine: The computation engine that executes the actual feature join and generation work. Mostly in Scala and Spark.
  • Feature Registry API: The API layer that supports SQL and Purview (Atlas) as storage. Written in Python (FastAPI).
  • Feature Registry Web UI: The web UI for the feature registry. Written in React.

Labels

bug (Something isn't working)
