Member

@haircommander haircommander commented Jun 11, 2020

What type of PR is this?

/kind bug

What this PR does / why we need it:

this PR does two things:

Which issue(s) this PR fixes:

Special notes for your reviewer:

Does this PR introduce a user-facing change?


@openshift-ci-robot openshift-ci-robot added the dco-signoff: yes Indicates the PR's author has DCO signed all their commits. label Jun 11, 2020
@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 11, 2020
Member

@saschagrunert saschagrunert left a comment

Just one nit, otherwise LGTM

Comment on lines 350 to 351
pid := c.state.Pid
process, err := os.FindProcess(pid)
Member

Suggested change
- pid := c.state.Pid
- process, err := os.FindProcess(pid)
+ process, err := os.FindProcess(c.state.Pid)

Member Author

because the function does not hold the lock, I am worried about the pid changing from underneath it. saving the value causes this function to run atomically

Member Author

though, I see now we don't even use this variable anywhere else. taking the change as suggested...

Member Author

fixed

@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: haircommander, saschagrunert

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [haircommander,saschagrunert]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

we have seen cases where $runtime state calls fail spuriously, but succeed later.
this is not great, but we shouldn't incorrectly label pods if this happens.

We now retry state calls up to three times if we determine the container is still running (by calling kill on its pid)

Signed-off-by: Peter Hunt <[email protected]>

having exec sync update state each time is a bit excessive.
In addition to the extra exec, it creates the potential for runc state to flake, causing the container to go down.
instead, we should just check if the pid is running, and proceed if so

Signed-off-by: Peter Hunt <[email protected]>
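
The two commit messages above describe a liveness check via kill on the container pid and a retry of the state call of up to three times. The following is a minimal sketch of that idea, not the code from this PR: `pidAlive`, `stateWithRetry`, `queryState`, `errSpurious`, and `maxStateRetries` are hypothetical names chosen for illustration, and the sketch assumes a Unix-like system where `syscall.Kill` is available.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// maxStateRetries mirrors the "up to three times" in the commit message.
const maxStateRetries = 3

// errSpurious is a hypothetical error used only by the demo in main.
var errSpurious = errors.New("spurious failure")

// pidAlive reports whether a process with the given pid exists.
// Sending signal 0 performs the existence and permission checks
// without delivering a signal.
func pidAlive(pid int) bool {
	if pid <= 0 {
		// kill(2) treats 0 and negative values as process groups.
		return false
	}
	err := syscall.Kill(pid, 0)
	// EPERM means the process exists but we may not signal it.
	return err == nil || errors.Is(err, syscall.EPERM)
}

// stateWithRetry calls queryState (a stand-in for the "$runtime state"
// call) up to maxStateRetries times. It retries only while the pid
// still looks alive; if the process is gone, it returns immediately.
func stateWithRetry(pid int, queryState func() (string, error)) (string, error) {
	var lastErr error
	for attempt := 1; attempt <= maxStateRetries; attempt++ {
		out, err := queryState()
		if err == nil {
			return out, nil
		}
		lastErr = err
		if !pidAlive(pid) {
			return "", fmt.Errorf("state failed and pid %d is not running: %w", pid, err)
		}
	}
	return "", fmt.Errorf("state failed after %d attempts: %w", maxStateRetries, lastErr)
}

func main() {
	calls := 0
	// Fails twice, then succeeds, imitating a flaky state call.
	flaky := func() (string, error) {
		calls++
		if calls < 3 {
			return "", errSpurious
		}
		return "running", nil
	}
	out, err := stateWithRetry(syscall.Getpid(), flaky)
	fmt.Println(out, err, calls)
}
```

A pid check alone cannot tell a reused pid apart from the original process, which is the concern raised later in this conversation.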
codecov bot commented Jun 11, 2020

Codecov Report

Merging #3867 into master will decrease coverage by 0.02%.
The diff coverage is 20.00%.

@@            Coverage Diff             @@
##           master    #3867      +/-   ##
==========================================
- Coverage   40.54%   40.51%   -0.03%     
==========================================
  Files         109      109              
  Lines        8798     8819      +21     
==========================================
+ Hits         3567     3573       +6     
- Misses       4913     4927      +14     
- Partials      318      319       +1     

Member

@mrunalp mrunalp left a comment

We should also store the process start time in container state and use that for matching the process to rule
out any pid reuse scenarios.
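
One way to sketch the start-time matching suggested here, assuming Linux and its `/proc/<pid>/stat` layout, where field 22 (`starttime`) is the process start time in clock ticks since boot. The helper names `parseStartTime`, `processStartTime`, and `sameProcess` are hypothetical and do not come from this PR.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseStartTime extracts starttime (field 22) from the contents of a
// /proc/<pid>/stat file.
func parseStartTime(stat string) (uint64, error) {
	// Field 2 (comm) is wrapped in parentheses and may itself contain
	// spaces or ')', so split only after the last ')'.
	end := strings.LastIndexByte(stat, ')')
	if end < 0 {
		return 0, fmt.Errorf("malformed stat data: no closing parenthesis")
	}
	fields := strings.Fields(stat[end+1:])
	// fields[0] is field 3 (state), so field 22 is fields[19].
	const idx = 22 - 3
	if len(fields) <= idx {
		return 0, fmt.Errorf("malformed stat data: only %d fields after comm", len(fields))
	}
	return strconv.ParseUint(fields[idx], 10, 64)
}

// processStartTime reads the start time of a live process from procfs.
func processStartTime(pid int) (uint64, error) {
	data, err := os.ReadFile(fmt.Sprintf("/proc/%d/stat", pid))
	if err != nil {
		return 0, err
	}
	return parseStartTime(string(data))
}

// sameProcess reports whether pid still refers to the process whose
// start time was recorded earlier (for example, in container state).
func sameProcess(pid int, recordedStart uint64) bool {
	start, err := processStartTime(pid)
	return err == nil && start == recordedStart
}

func main() {
	pid := os.Getpid()
	start, err := processStartTime(pid)
	if err != nil {
		fmt.Println("could not read start time:", err)
		return
	}
	fmt.Println("still the same process:", sameProcess(pid, start))
}
```

If the pid has been recycled, the new process has a different start time, so the comparison fails even though the pid itself is alive.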

@haircommander
Member Author

> We should also store the process start time in container state and use that for matching the process to rule
> out any pid reuse scenarios.

I'd like to do that as a follow up, to use the implementation for all accesses of c.state.Pid

@haircommander
Member Author

/retest

@haircommander
Member Author

/hold

the approach is changing

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 11, 2020
@haircommander
Member Author

closing in favor of #3868

@haircommander haircommander deleted the log-updates-2 branch September 27, 2021 16:54