Description
Code of Conduct
- I agree to follow this project's Code of Conduct
Search before asking
- I have searched in the issues and found no similar issues.
Describe the feature
This feature introduces a mechanism that starts a shutdown watchdog when the Spark engine decides to shut down. If the timeout is reached, the watchdog prints the stack traces of all currently alive threads and then forcibly terminates the process.
Motivation
Currently, there are scenarios where the engine should exit but fails to do so for various reasons, and these scenarios cannot be exhaustively enumerated. For example, see this discussion: #6992 (reply in thread), and these issues: #4280, #7019.
We encountered this issue in production as well. For example, in the following log, after the SparkContext stopped, the process should have run its shutdown hooks and exited. However, a stuck Ranger thread blocked the shutdown, and the process hung for over ten days until it exhausted the ECS resources and was finally discovered.
Describe the solution
I want to add a daemon watchdog thread that is started with a timeout when the stop() method is called. If the process shuts down normally, the daemon thread is interrupted and the process exits gracefully. If the timeout is reached and the process is still alive, some threads must be blocking the shutdown; the watchdog then prints all live threads in the process and forcibly exits.
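The idea above can be sketched with standard JDK APIs. This is a minimal illustration, not the actual Kyuubi implementation: the class name `ShutdownWatchdog` and its methods are hypothetical, but `Thread.getAllStackTraces()` and `Runtime.halt()` are real JDK calls that dump live threads and bypass blocked shutdown hooks, respectively.

```java
import java.util.Map;

public class ShutdownWatchdog {
    private final long timeoutMs;
    private Thread watchdog;

    public ShutdownWatchdog(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    /** Start the watchdog; call this when stop() begins. */
    public synchronized void start() {
        watchdog = new Thread(() -> {
            try {
                Thread.sleep(timeoutMs);
            } catch (InterruptedException e) {
                return; // shutdown completed normally in time
            }
            // Timeout reached: dump stack traces of all live threads.
            for (Map.Entry<Thread, StackTraceElement[]> e
                    : Thread.getAllStackTraces().entrySet()) {
                System.err.println("Thread: " + e.getKey().getName());
                for (StackTraceElement frame : e.getValue()) {
                    System.err.println("    at " + frame);
                }
            }
            // halt() skips shutdown hooks, so a stuck hook cannot block it.
            Runtime.getRuntime().halt(1);
        }, "shutdown-watchdog");
        watchdog.setDaemon(true); // must not itself keep the JVM alive
        watchdog.start();
    }

    /** Cancel the watchdog when shutdown completes normally. */
    public synchronized void cancel() {
        if (watchdog != null) {
            watchdog.interrupt();
        }
    }

    /** True while the watchdog thread is still running. */
    public synchronized boolean isRunning() {
        return watchdog != null && watchdog.isAlive();
    }
}
```

Marking the thread as a daemon is important: the watchdog itself must never be the thread that keeps the JVM from exiting.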
Additional context
No response
Are you willing to submit PR?
- Yes. I would be willing to submit a PR with guidance from the Kyuubi community.
- No. I cannot submit a PR at this time.