@cynepco3hahue

What type of PR is this?

/kind bug

What this PR does / why we need it:

The kernel may rebuild the sched_domain related files while CPU load balancing
is being enabled or disabled for container CPUs. When that happens, the
operation fails with file errors such as:

  1. lstat /proc/sys/kernel/sched_domain/cpu22/domain1/flags: no such file or directory
  2. readdirent /proc/sys/kernel/sched_domain/cpu66/domain0: no such file or directory

Add retry logic around setting CPU load balancing values to reduce the possibility of such errors.

NONE

Signed-off-by: Artyom Lukianov [email protected]

@openshift-ci openshift-ci bot added release-note-none Denotes a PR that doesn't merit a release note. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. kind/bug Categorizes issue or PR as related to a bug. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Apr 6, 2022
@openshift-ci
Contributor

openshift-ci bot commented Apr 6, 2022

Hi @cynepco3hahue. Thanks for your PR.

I'm waiting for a cri-o member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot requested review from klihub and wgahnagl April 6, 2022 13:31
@cynepco3hahue cynepco3hahue force-pushed the retry_setting_cpu_load_balancing branch from c7c874d to 5502351 Compare April 6, 2022 13:32
@cynepco3hahue
Author

@haircommander Can you please review it?

// TODO: re-visit once we will have some more acceptable cgroups hierarchy to disable CPU load balancing
// correctly via cgroups, see https://bugzilla.redhat.com/show_bug.cgi?id=1946801
return wait.PollImmediate(time.Second, 5*time.Second, func() (bool, error) {
	if err := setCPUSLoadBalancing(c, enable, schedDomainDir); err != nil {
Member

should we only do this if os.IsNotExist(err)?

Author

yes it makes sense

Author

Done

It is possible that the kernel will rebuild sched_domain related files and
because of it enabling or disabling CPU load balancing for container CPUs
will fail with different file errors:

1. lstat /proc/sys/kernel/sched_domain/cpu22/domain1/flags: no such file or directory
2. readdirent /proc/sys/kernel/sched_domain/cpu66/domain0: no such file or directory

Add retry logic around setting CPU load balancing values to reduce the possibility of such errors.

Signed-off-by: Artyom Lukianov <[email protected]>
@cynepco3hahue cynepco3hahue force-pushed the retry_setting_cpu_load_balancing branch from 5502351 to 1098cc9 Compare April 6, 2022 16:28
@haircommander
Member

/approve

LGTM
@cri-o/cri-o-maintainers PTAL

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 6, 2022
@cynepco3hahue
Author

@haircommander Can you please add ok-to-test?

@haircommander
Member

/ok-to-test

@openshift-ci openshift-ci bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Apr 7, 2022
@cynepco3hahue
Author

/retest

1 similar comment
@cynepco3hahue
Author

/retest

@cynepco3hahue
Author

@haircommander Do you know whom I can ask for an additional review?

Member

@saschagrunert saschagrunert left a comment

Good one!
/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Apr 12, 2022
@openshift-ci
Contributor

openshift-ci bot commented Apr 12, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cynepco3hahue, haircommander, saschagrunert

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [haircommander,saschagrunert]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@cynepco3hahue
Author

> Good one!
> /lgtm

Thanks!

@saschagrunert
Member

/test e2e-gcp

@openshift-bot

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot merged commit 9ed9393 into cri-o:main Apr 12, 2022
@kolyshkin
Collaborator

/cherry-pick release-1.21

@openshift-cherrypick-robot

@kolyshkin: new pull request created: #5919

In response to this:

/cherry-pick release-1.21

@kolyshkin
Collaborator

/cherry-pick release-1.22
/cherry-pick release-1.23

@openshift-cherrypick-robot

@kolyshkin: new pull request created: #5920

In response to this:

/cherry-pick release-1.22
/cherry-pick release-1.23

@kolyshkin
Collaborator

/cherry-pick release-1.23

@openshift-cherrypick-robot

@kolyshkin: new pull request created: #5921

In response to this:

/cherry-pick release-1.23

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. kind/bug Categorizes issue or PR as related to a bug. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note-none Denotes a PR that doesn't merit a release note.

7 participants