
Conversation

@turboFei
Member

@turboFei turboFei commented Jan 24, 2026

Why are the changes needed?

I saw a pod stay pending for 3 hours.

/var/log/hadoop/kyuubi/audit/k8s-audit.log.2026-01-23-08.gz::2026-01-23 08:09:31.356 INFO [-935552520-pool-3-thread-2861] org.apache.kyuubi.engine.KubernetesApplicationAuditLogger: eventType=UPDATE	label=cb370310-9c4c-4982-817e-85f4f6c031a9	context=kube-apiserver-a-spark	namespace=dls-prod	pod=kyuubi-hadp-adi-hadp-w-md-tbl-location-w-muso-0-ae7bd346-20260123-stm-cb370310-9c4c-4982-817e-85f4f6c031a9-driver	podState=Pending	containers=[]	appId=spark-f67cc33544654d828d36cc9553404a3e	appName=hadp-adi-hadp-w-md-t-818abc6d23397c33859c76dff9aa82e7244a92ba	appState=PENDING	appError=''
/var/log/hadoop/kyuubi/audit/k8s-audit.log.2026-01-23-11.gz::2026-01-23 11:09:50.121 INFO [-935552520-pool-3-thread-2902] org.apache.kyuubi.engine.KubernetesApplicationAuditLogger: eventType=UPDATE	label=cb370310-9c4c-4982-817e-85f4f6c031a9	context=kube-apiserver-a-spark	namespace=dls-prod	pod=kyuubi-hadp-adi-hadp-w-md-tbl-location-w-muso-0-ae7bd346-20260123-stm-cb370310-9c4c-4982-817e-85f4f6c031a9-driver	podState=Pending	containers=[]	appId=spark-f67cc33544654d828d36cc9553404a3e	appName=hadp-adi-hadp-w-md-t-818abc6d23397c33859c76dff9aa82e7244a92ba	appState=PENDING	appError=''

However, the audit log did not record why the pod was pending for so long.

So I think we should check the pod status when the pod is pending.

Also, for a YARN app, the pending reason is returned as well, for example waiting for AM allocation.

How was this patch tested?

Code review.

Was this patch authored or co-authored using generative AI tooling?

No.

@codecov-commenter

codecov-commenter commented Jan 24, 2026

Codecov Report

❌ Patch coverage is 0% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 0.00%. Comparing base (2e79aa2) to head (1faff5e).
⚠️ Report is 26 commits behind head on master.

Files with missing lines Patch % Lines
...kyuubi/engine/KubernetesApplicationOperation.scala 0.00% 4 Missing ⚠️
Additional details and impacted files
@@          Coverage Diff           @@
##           master   #7313   +/-   ##
======================================
  Coverage    0.00%   0.00%           
======================================
  Files         697     698    +1     
  Lines       43587   43656   +69     
  Branches     5893    5896    +3     
======================================
- Misses      43587   43656   +69     


- if (ApplicationState.isFailed(applicationState, supportPersistedAppState = true)) {
+ if (ApplicationState.isFailed(
+     applicationState,
+     supportPersistedAppState = true) || ApplicationState.PENDING == applicationState) {
Member

Just to make sure: this won't have any side effect on control flow? The method getApplicationStateAndErrorFromPod is called, indirectly, by more than just KubernetesApplicationAuditLogger.

Member Author

Yes, there is no side effect on control flow.

Member

@pan3793 pan3793 left a comment

LGTM, provided the question above gets a positive answer.

@turboFei turboFei added this to the v1.10.4 milestone Jan 27, 2026
@turboFei turboFei self-assigned this Jan 27, 2026
@cxzl25 cxzl25 requested a review from Copilot January 28, 2026 11:35
Copilot AI left a comment

Pull request overview

This pull request enhances Kubernetes application audit logging to include status information for pods in PENDING state. Previously, only failed applications had their detailed status audited, but pods stuck in pending state can also indicate issues that need diagnosis. The change enables the audit logger to capture PodStatus and ContainerStatus information when a Kubernetes pod is pending, similar to how YARN already captures diagnostics for all states including PENDING.

Changes:

  • Modified the error collection condition in getApplicationStateAndErrorFromPod to include PENDING state alongside failed states, enabling status auditing for pending pods
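
The condition change described above can be sketched roughly as follows. This is a simplified illustration, not Kyuubi's code: `PendingAuditSketch`, `AppState`, the one-argument `isFailed`, and `errorFor` are hypothetical stand-ins for `ApplicationState` and the logic inside `getApplicationStateAndErrorFromPod`, and the pod status is reduced to a plain string.

```scala
// Hypothetical, simplified stand-ins; not Kyuubi's actual definitions.
object PendingAuditSketch {
  sealed trait AppState
  case object Pending extends AppState
  case object Running extends AppState
  case object Finished extends AppState
  case object Failed extends AppState

  // Simplified failure check (the real ApplicationState.isFailed takes more arguments).
  def isFailed(state: AppState): Boolean = state == Failed

  // Return pod status details for failed pods and, with this change, for pending pods too.
  // Every other state yields no error detail, as before.
  def errorFor(state: AppState, podStatus: String): Option[String] =
    if (isFailed(state) || state == Pending) Some(podStatus) else None
}
```

Under these assumptions, a pending pod now yields a status string for the audit log, while a running pod still yields none.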


  Map("Pod" -> podName, "PodStatus" -> pod.getStatus)
  }
- Some(JsonUtils.toPrettyJson(errorMap.asJava))
+ Some(JsonUtils.toJson(errorMap.asJava))
Member Author

For audit purposes, toPrettyJson (multi-line output) is not suitable.
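
To illustrate the point, here is a minimal sketch of why multi-line JSON is awkward in a log that writes one entry per line, as the audit entries quoted in the description do. The two renderers below are hypothetical stand-ins for `JsonUtils.toJson` and `JsonUtils.toPrettyJson`; they handle only flat string pairs and do no escaping, so they only approximate the real output format.

```scala
// Hypothetical minimal renderers; stand-ins for JsonUtils.toJson / JsonUtils.toPrettyJson.
// They accept flat string pairs only and do not escape special characters.
object AuditJsonSketch {
  private def quote(s: String): String = "\"" + s + "\""

  // Compact form: the whole object stays on a single line.
  def toJson(fields: Seq[(String, String)]): String =
    fields.map { case (k, v) => quote(k) + ":" + quote(v) }.mkString("{", ",", "}")

  // Pretty form: one field per line, so a single audit entry would span several lines.
  def toPrettyJson(fields: Seq[(String, String)]): String =
    fields.map { case (k, v) => "  " + quote(k) + " : " + quote(v) }
      .mkString("{\n", ",\n", "\n}")
}
```

With two fields, the compact form contains no newline, while the pretty form splits into four lines, which would break line-oriented tools such as grep over the audit log.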
