BugFix: scale-up workload stuck pending not triggering preemption. #6973
Conversation
Scheduler: add safety check to ensure mutual exclusivity of ElasticJobsViaWorkloadSlices and PartialAdmission. These two features are not designed to work together. While admission webhooks already prevent `PartialAdmission` from being combined with `ElasticJobsViaWorkloadSlices`, we add this safety check in the scheduler as well to guard against misconfiguration. This ensures that both features cannot be enabled at the same time, avoiding unexpected scheduling behavior. Signed-off-by: Illya Chekrygin <[email protected]>
Fix: account for replaced workload slice in assigned flavor TotalRequestFor. Ensures that when a workload slice is replaced, its resource requests are properly reflected in `TotalRequestFor` during flavor assignment and subsequent preemption targets list generation. Signed-off-by: Illya Chekrygin <[email protected]>
✅ Deploy Preview for kubernetes-sigs-kueue canceled.
/test pull-kueue-test-integration-baseline-main
I didn't check everything, but I left a few comments.
}
cfg = fwk.Init()
ctx, k8sClient = fwk.SetupClient(cfg)
gomega.Expect(utilfeature.DefaultMutableFeatureGate.SetFromMap(map[string]bool{string(features.ElasticJobsViaWorkloadSlices): true})).Should(gomega.Succeed())
We should avoid enabling such an Alpha feature at the global level. We should set the FG only in the affected case. You can learn HOW in:
features.SetFeatureGateDuringTest(ginkgo.GinkgoTB(), features.MultiKueueBatchJobWithManagedBy, true)
Ah... yes. Good catch! For some reason I thought I needed "feature" activation at the manager level.
Reverted.
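For reference, a minimal sketch of what the test-case scoped activation could look like (the `ginkgo.It` description and the test body here are illustrative placeholders, not the actual test code):

```go
ginkgo.It("should preempt lower-priority workloads when a slice scales up", func() {
	// Scope the Alpha gate to this single test case instead of flipping
	// DefaultMutableFeatureGate for the whole suite.
	features.SetFeatureGateDuringTest(ginkgo.GinkgoTB(), features.ElasticJobsViaWorkloadSlices, true)

	// ... create the job, scale it up, and assert preemption here ...
})
```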
if a.replaceWorkloadSlice != nil {
	ps = *ps.ScaledTo(ps.Count - a.replaceWorkloadSlice.TotalRequests[i].Count)
} else {
	aps := a.PodSets[i]
	if aps.Count != ps.Count {
		ps = *ps.ScaledTo(aps.Count)
	}
}
Instead of doing two independent branches, could we just decrement the value we are scaling?
Something like:
newCount := a.PodSets[i].Count
if a.replaceWorkloadSlice != nil {
	newCount -= a.replaceWorkloadSlice.TotalRequests[i].Count
}
ps = *ps.ScaledTo(newCount)
wdyt?
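If I read the suggestion right, the effect on a scale-up is that only the delta is scaled into the new slice's totals. A tiny worked example with hypothetical counts (not taken from the actual tests):

```go
package main

import "fmt"

func main() {
	// Hypothetical values: the replaced slice held 2 pods, the new slice asks for 5.
	newCount := int32(5)      // stands in for a.PodSets[i].Count
	replacedCount := int32(2) // stands in for a.replaceWorkloadSlice.TotalRequests[i].Count

	// Only the 3-pod delta is carried into the scaled pod set, so flavor
	// assignment and preemption-target generation account for 3 pods, not 5.
	newCount -= replacedCount
	fmt.Println(newCount) // 3
}
```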
util.ExpectObjectToBeDeleted(ctx, k8sClient, cq, true)
})

ginkgo.It("Should preempt on-create Workloads with lower priority when there is not enough quota", func() { |
Does this "It" introduce additional coverage?
It’s updated (PTAL), with feature activation moved into the test-case scope.
This test is dedicated to validating that general workload preemption works — specifically, that an elastic workload can preempt another job outside of the scale-up context. In other words, it serves more as a sanity check.
Happy to remove it if we think it’s redundant.
I see the motivation. OTOH we already have a test exercising priority based preemption (in the simple case) for ElasticJobs here:
kueue/test/integration/singlecluster/controller/jobs/job/job_controller_test.go
Lines 3491 to 3498 in 785114f
highPriorityJob := testingjob.MakeJob("high", ns.Name).
	SetAnnotation(workloadslicing.EnabledAnnotationKey, workloadslicing.EnabledAnnotationValue).
	Queue(kueue.LocalQueueName(localQueue.Name)).
	Request(corev1.ResourceCPU, "1000m").
	Parallelism(3).
	Completions(3).
	WorkloadPriorityClass(highPriorityClass.Name).
	Obj()
I know the pre-existing test is a bit more involved, but I'm not sure adding the new small test makes this easier to understand.
I would focus only on the issue at hand and drop the extra test. I would be happy to keep it if we didn't already have the other pre-existing test.
done!
pkg/scheduler/scheduler.go
Outdated
}

-if features.Enabled(features.PartialAdmission) && wl.CanBePartiallyAdmitted() {
+if features.Enabled(features.PartialAdmission) && wl.CanBePartiallyAdmitted() && replaceableWorkloadSlice == nil {
Do we have some test for this? IIUC the new integration tests (and the reported issue) also show up with "PartialAdmission" disabled, so I think this code change may not be exercised by the integration tests.
So, IIUC this code change is fixing a different issue. Also, I'm wondering if we shouldn't just disable PartialAdmission for ElasticJobs at the validation level.
Let me know if I'm missing something.
> So, IIUC this code change is fixing a different issue. Also, I'm wondering if we shouldn't just disable PartialAdmission for ElasticJobs at the validation level.
We already enforce that at the validation level. The purpose of this change is to add a sanity check, ensuring that PartialAdmission (PA) is never combined with ElasticJobs (EJ). It also serves as an explicit reminder at the scheduler level that PA is not compatible with EJ.
Please let me know if this addition is a "block"; I am happy to revert this change.
Let's clean up the code. We can rely on validation; this is the common practice in k8s, to avoid checking against invariants which are already satisfied by validation.
done
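For context, the invariant being relied on is roughly of this shape; a hypothetical, simplified sketch of a validation-time check, not Kueue's actual webhook code:

```go
package validation

import "errors"

// validateSliceVsPartialAdmission illustrates the invariant only: a workload
// that opts into elastic workload slices must not also request partial
// admission. The real enforcement lives in the admission webhooks.
func validateSliceVsPartialAdmission(slicesEnabled, requestsPartialAdmission bool) error {
	if slicesEnabled && requestsPartialAdmission {
		return errors.New("ElasticJobsViaWorkloadSlices cannot be combined with PartialAdmission")
	}
	return nil
}
```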
Revert feature enablement at the suite level, replacing it with test-case scoped feature activation. Signed-off-by: Illya Chekrygin <[email protected]>
Update TotalRequestsFor calculation per PR feedback. Signed-off-by: Illya Chekrygin <[email protected]>
Revert change in scheduler removing check for ElasticWorkload. Signed-off-by: Illya Chekrygin <[email protected]>
Remove "redundant" integration test. Signed-off-by: Illya Chekrygin <[email protected]>
Thank you 👍
@mimowo: once the present PR merges, I will cherry-pick it on top of release-0.13. In response to this:
LGTM label has been added. Git tree hash: 52172b576dc766e81a53b27176d90af8834b2e91
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: ichekrygin, mimowo
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
@mimowo: #6973 failed to apply on top of branch "release-0.13":
In response to this:
@ichekrygin can you maybe prepare the cherry-pick? The ./hack/cherry_pick_pull.sh script is pretty useful for that.
BugFix: scale-up workload stuck pending not triggering preemption. (kubernetes-sigs#6973)

* Scheduler: add safety check to ensure mutual exclusivity of ElasticJobsViaWorkloadSlices and PartialAdmission. These two features are not designed to work together. While admission webhooks already prevent `PartialAdmission` from being combined with `ElasticJobsViaWorkloadSlices`, we add this safety check in the scheduler as well to guard against misconfiguration. This ensures that both features cannot be enabled at the same time, avoiding unexpected scheduling behavior. Signed-off-by: Illya Chekrygin <[email protected]>
* Fix: account for replaced workload slice in assigned flavor TotalRequestFor. Ensures that when a workload slice is replaced, its resource requests are properly reflected in `TotalRequestFor` during flavor assignment and subsequent preemption targets list generation. Signed-off-by: Illya Chekrygin <[email protected]>
* Revert feature enablement at the suite level, replacing it with test-case scoped feature activation. Signed-off-by: Illya Chekrygin <[email protected]>
* Update TotalRequestsFor calculation per PR feedback. Signed-off-by: Illya Chekrygin <[email protected]>
* Revert change in scheduler removing check for ElasticWorkload. Signed-off-by: Illya Chekrygin <[email protected]>
* Remove "redundant" integration test. Signed-off-by: Illya Chekrygin <[email protected]>

---------

Signed-off-by: Illya Chekrygin <[email protected]>
(cherry picked from commit 19ebce0)
BugFix: scale-up workload stuck pending not triggering preemption. (#6973) (#7013)

* Scheduler: add safety check to ensure mutual exclusivity of ElasticJobsViaWorkloadSlices and PartialAdmission. These two features are not designed to work together. While admission webhooks already prevent `PartialAdmission` from being combined with `ElasticJobsViaWorkloadSlices`, we add this safety check in the scheduler as well to guard against misconfiguration. This ensures that both features cannot be enabled at the same time, avoiding unexpected scheduling behavior.
* Fix: account for replaced workload slice in assigned flavor TotalRequestFor. Ensures that when a workload slice is replaced, its resource requests are properly reflected in `TotalRequestFor` during flavor assignment and subsequent preemption targets list generation.
* Revert feature enablement at the suite level, replacing it with test-case scoped feature activation.
* Update TotalRequestsFor calculation per PR feedback.
* Revert change in scheduler removing check for ElasticWorkload.
* Remove "redundant" integration test.

---------

(cherry picked from commit 19ebce0)
Signed-off-by: Illya Chekrygin <[email protected]>
/release-note-edit
What type of PR is this?
/kind bug
What this PR does / why we need it:
This PR fixes a bug where a scaled-up workload stuck pending does not trigger preemption.
Which issue(s) this PR fixes:
Fixes #6969
Special notes for your reviewer:
Does this PR introduce a user-facing change?