@haircommander haircommander commented Jun 11, 2020

What type of PR is this?

/kind bug

What this PR does / why we need it:

As we found in the conmonmon saga, directly accessing pids on the host is a bit dangerous. Primarily, we risk believing a pid is the container process when it is really a new process, because of pid wrap. While this is unlikely to happen if pid_max is set to the kernel maximum (about 4 million on 64-bit systems), it can still happen, and it can cause security problems (especially when the pid is used to find namespaces whose lifecycle we are not managing).

Now, we drop every direct c.state.Pid access in favor of a helper function c.Pid() (and children). This allows us to do the proper verification: checking that the start time (stime) of the pid we're trying to use is the same as the one the container's pid originally had.
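As a rough illustration of the verification described above, here is a minimal Go sketch. It is not the code in this PR: the `container` type, its field names, the `ErrPidReused` error, and the injected `startTimeOf` lookup are all hypothetical, chosen so the example runs without touching /proc.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrPidReused is a hypothetical sentinel error for a detected pid wrap.
var ErrPidReused = errors.New("pid is now owned by a different process")

// container holds only what this sketch needs; the field names are assumptions.
type container struct {
	initPid       int    // pid recorded when the container started
	initStartTime string // start time recorded at the same moment
}

// pid returns the container pid only if the process that currently owns that
// number has the same start time that was recorded at container start.
// startTimeOf is an injected lookup (hypothetical) so the sketch runs anywhere.
func (c *container) pid(startTimeOf func(int) (string, error)) (int, error) {
	if c.initPid <= 0 {
		return 0, errors.New("container has no recorded pid")
	}
	current, err := startTimeOf(c.initPid)
	if err != nil {
		return 0, fmt.Errorf("pid %d not found: %w", c.initPid, err)
	}
	if current != c.initStartTime {
		// Same pid number, different process: the pid wrapped.
		return 0, ErrPidReused
	}
	return c.initPid, nil
}

func main() {
	c := &container{initPid: 4242, initStartTime: "1000"}
	same := func(int) (string, error) { return "1000", nil }
	wrapped := func(int) (string, error) { return "2000", nil }

	fmt.Println(c.pid(same))    // the pid is still the container's
	fmt.Println(c.pid(wrapped)) // the pid number was reused
}
```

The point of routing every access through one helper is that no caller can build a /proc path or send a signal from an unverified pid.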

This PR also refactors how we use FindProcess(), as there was a bit of duplication, and direct accesses to c.state.Pid were littered in code associated with FindProcess()

This also carries the exec sync patch checking c.IsRunning() instead of updating container state, as a bonus.

TODO:

  • test restore case (I need to find the best way to persist the pid to disk)
  • add more tests
    - [ ] maybe add this pid handling stuff in a new package for better unit testing

But I'm throwing this up to see how CI likes it

Which issue(s) this PR fixes:

Special notes for your reviewer:

Does this PR introduce a user-facing change?


@openshift-ci-robot openshift-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels Jun 11, 2020
@openshift-ci-robot openshift-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jun 11, 2020
@haircommander
Member Author

/retest

// the unix start time of the process, as found by psgo
// this is used to track whether the pid we have stored
// is the same as the same pid number on the host
psStartTime string
Member

How about pidStartTime ?

Member Author

yeah that's better. I was at a loss for a variable name :)

Member

processStartTime is probably better than that.

Member

This will have to be persisted in state.

Member

@mrunalp mrunalp Jun 11, 2020

I am thinking that we should probably separate the pid returned from runc state from the container start pid. runc state could change to zero once stopped. We can separately store the startTime, Pid as top level fields in crio state that never change once populated.
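One possible shape for that separation, sketched in Go. The `ContainerState` struct, its field names and JSON tags, and the `SetInitPid` helper are assumptions made for illustration, not the state format CRI-O actually uses. The JSON round trip stands in for persisting the state to disk and restoring it.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ContainerState is a hypothetical, trimmed-down state record.
type ContainerState struct {
	Pid           int    `json:"pid"`                     // mirrors runtime state; may drop to 0 once stopped
	InitPid       int    `json:"initPid,omitempty"`       // written once, never updated
	InitStartTime string `json:"initStartTime,omitempty"` // written once, never updated
}

// SetInitPid records the start pid and its start time exactly once.
func (s *ContainerState) SetInitPid(pid int, startTime string) error {
	if s.InitPid != 0 {
		return errors.New("init pid already set")
	}
	if pid <= 0 {
		return errors.New("invalid pid")
	}
	s.InitPid, s.InitStartTime = pid, startTime
	return nil
}

// roundTrip simulates saving the state and restoring it after a restart.
func roundTrip(s ContainerState) (ContainerState, error) {
	data, err := json.Marshal(s)
	if err != nil {
		return ContainerState{}, err
	}
	var out ContainerState
	err = json.Unmarshal(data, &out)
	return out, err
}

func main() {
	s := ContainerState{Pid: 4242}
	if err := s.SetInitPid(4242, "987654"); err != nil {
		panic(err)
	}
	s.Pid = 0 // the runtime now reports the container as stopped

	restored, err := roundTrip(s)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", restored) // InitPid and InitStartTime survive the update
}
```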

@haircommander
Member Author

/retest

@codecov

codecov bot commented Jun 17, 2020

Codecov Report

Merging #3868 into master will increase coverage by 0.29%.
The diff coverage is 64.70%.

@@            Coverage Diff             @@
##           master    #3868      +/-   ##
==========================================
+ Coverage   40.41%   40.71%   +0.29%     
==========================================
  Files         111      109       -2     
  Lines        8947     8970      +23     
==========================================
+ Hits         3616     3652      +36     
+ Misses       4999     4986      -13     
  Partials      332      332              

@haircommander haircommander force-pushed the check-pid branch 2 times, most recently from fc5fabe to 641d4fe Compare June 17, 2020 18:53
@haircommander haircommander changed the title WIP oci: check pid before using it oci: check pid before using it Jun 17, 2020
@openshift-ci-robot openshift-ci-robot removed do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Jun 17, 2020
@haircommander
Member Author

/retest

@haircommander
Member Author

/retest

@haircommander
Member Author

The integration test failures are a runc problem, fixed in opencontainers/runc#2479; the e2e test failures are legit.

@haircommander haircommander force-pushed the check-pid branch 2 times, most recently from e9e10bf to eaeae76 Compare June 19, 2020 16:41
@haircommander
Member Author

/retest

@haircommander
Member Author

/retest

}

// getInitStartTime reads the kernel's /proc entry for stime for PID.
func getInitStartTime(pid int) (int, error) {
Collaborator

The word Init in the function name is misleading, since the function doesn't care if the pid is that of init or not.
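For readers unfamiliar with the /proc lookup under discussion, here is a hedged Go sketch of a pid-agnostic version. It assumes the value wanted is field 22 (`starttime`) of `/proc/<pid>/stat`, as documented in proc(5). The names `getPidStartTime` and `parseStartTime` are hypothetical, and the string return type is a simplification of the `(int, error)` signature shown in the diff.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
)

// parseStartTime extracts field 22 (starttime) from the contents of
// /proc/<pid>/stat. The comm field (field 2) is wrapped in parentheses and may
// itself contain spaces or parentheses, so parsing starts after the last ')'.
func parseStartTime(stat string) (string, error) {
	end := strings.LastIndexByte(stat, ')')
	if end < 0 {
		return "", errors.New("malformed stat: no closing parenthesis")
	}
	// fields[0] is field 3 (state), so field 22 is fields[19].
	fields := strings.Fields(stat[end+1:])
	if len(fields) < 20 {
		return "", errors.New("malformed stat: too few fields")
	}
	return fields[19], nil
}

// getPidStartTime reads the start time of any pid, init or not.
func getPidStartTime(pid int) (string, error) {
	data, err := os.ReadFile(fmt.Sprintf("/proc/%d/stat", pid))
	if err != nil {
		return "", err
	}
	return parseStartTime(string(data))
}

func main() {
	st, err := getPidStartTime(os.Getpid())
	if err != nil {
		// Expected on systems without a Linux-style /proc.
		fmt.Println("could not read start time:", err)
		return
	}
	fmt.Println("start time of this process (clock ticks since boot):", st)
}
```

Keeping the parser separate from the file read makes the field arithmetic testable without a live process.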

// won't exist anyway.
pid, _ := ctr.pid() // nolint:errcheck
if pid > 0 {
netNsPath := fmt.Sprintf("/proc/%d/ns/net", pid)
Member

Why don't we return the pinned path here? We can get rid of the pid check then.

Member Author

a couple of reasons:

primarily: we don't have direct access to the Sandbox object, which owns the namespaces. We'd have to ask the server what the sandbox is, and then ask the sandbox what the namespace is.
secondarily, is it possible to have a private net namespace? if so, the sandbox level namespace won't be the correct one

Member

At least for the pinned ns case.

Member

@mrunalp mrunalp Jul 30, 2020

There are no private net namespaces in a pod. All the containers always share the pod network namespace.

Member Author

ah, well we still have the access problem. I would replumb this as a follow up, if we'd like

@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 30, 2020
having exec sync update state each time is a bit excessive.
In addition to exec'ing extra, it causes potential for runc state to flake, causing the container to go down.
instead, we should just check if the pid is running, and proceed if so

Signed-off-by: Peter Hunt <[email protected]>
@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 30, 2020
in any case where we want to directly manipulate a pid on the host. this is unsafe, as we can encounter pid wrap.
for those cases, we need to check the pid is the one we want to access, whether it's creating a namespace path with /proc
or directly killing a container

also, add unit tests for this fix, as well as refactor a few of the oci tests

Signed-off-by: Peter Hunt <[email protected]>
@haircommander haircommander force-pushed the check-pid branch 2 times, most recently from 9c32a5e to ee7a9a0 Compare July 30, 2020 20:56
@haircommander
Member Author

@kolyshkin pointed out that most of findprocess is now obsolete with the existence of verifyPid(). I have removed it and associated calls

now, verifyPid does all of the juicy bits of findprocess, without any of the overhead.

Further, it added a lot of code that was largely written as a compatibility layer for windows.

Since we have no intention in the near future to support windows anymore, let's drop it

Signed-off-by: Peter Hunt <[email protected]>
@haircommander
Member Author

/retest

  // is valid. If not, infraPid should be less than or equal to 0
  if ns == nil || ns.Get() == nil {
- if infraPid >= 0 {
+ if infraPid > 0 {
Contributor

this doesn't match the new comment on 319. Did you mean to remove the = here?

Member Author

the comment is meant to say "if it's not valid". is the wording unclear?

Collaborator

I also had to read this part twice, and it is correct, including the comment, but maybe not super clear. Not sure how to improve it though :(
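One way to read the condition under discussion, as a standalone Go sketch. The `nsPath` function and its parameters are hypothetical simplifications: the real code takes a namespace object rather than a path string. The sketch assumes an empty `pinnedPath` means the namespace is not managed, and that the caller passes `infraPid <= 0` when the infra container is not valid.

```go
package main

import "fmt"

// nsPath picks the path for a sandbox namespace.
//   - A pinned (managed) namespace wins whenever it exists.
//   - Otherwise fall back to /proc/<infraPid>/ns/<nsType>, but only for a
//     strictly positive pid: a pid of 0 or less means "no valid infra
//     container", and no usable /proc path can be built from it.
func nsPath(pinnedPath, nsType string, infraPid int) string {
	if pinnedPath != "" {
		return pinnedPath
	}
	if infraPid > 0 {
		return fmt.Sprintf("/proc/%d/ns/%s", infraPid, nsType)
	}
	return ""
}

func main() {
	fmt.Println(nsPath("/run/netns/example", "net", 4242)) // pinned path is preferred
	fmt.Println(nsPath("", "net", 4242))                   // falls back to /proc
	fmt.Printf("%q\n", nsPath("", "net", 0))               // invalid infra pid: no path
}
```

Under this reading, `> 0` matches the comment: an invalid infra container is signalled by a pid of 0 or less, so 0 must not be used to build a path.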

return false, nil
}

process, err := findprocess.FindProcess(pid)
Collaborator

I was not suggesting to drop process.Release(), but drop both find and release. We have already checked the process is there, I'm not sure if it makes sense to do it one more time.


cState := c.State()
if !(cState.Status == oci.ContainerStateRunning || cState.Status == oci.ContainerStateCreated) {
if !c.IsAlive() {
Collaborator

maybe

if _, err := c.verifyPid(); err != nil {

and remove the whole IsAlive function?

Or, use IsAlive in all the places where you're using the if like the one in this comment. Otherwise it's a duplicated functionality.

Member Author

the nice thing about the different functions has to do with error handling. In the case of IsAlive, I believe it should be used in cases we presume the container to be alive, and it would be an error if it was not.

Conversely, verifyPid does not make an assertion about what the liveness of the container should be.

For instance, all uses in runtime_oci.go assume the container will eventually not be alive. In that case, we wouldn't want to print an error that the container wasn't alive.

I think it is correct to error in server/container_execsync.go, as the client asked to exec in a container that is not running. Propagating the reason for this failure (pid wrap, container exited, we exec'ed too early somehow) is useful.

Do you have suggestions on how to balance these two use cases and error scenarios, @kolyshkin?
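The split between the two helpers can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: the `container` fields, the injected `startTimeOf` lookup, the error wording, and the `hasExited` helper are all hypothetical. Only the names `verifyPid` and `IsAlive` come from the discussion.

```go
package main

import (
	"errors"
	"fmt"
)

type container struct {
	id            string
	initPid       int
	initStartTime string
	startTimeOf   func(pid int) (string, error) // injected for the sketch
}

// verifyPid reports whether the recorded pid still belongs to the container.
// It makes no claim about whether the container is supposed to be running.
func (c *container) verifyPid() (int, error) {
	st, err := c.startTimeOf(c.initPid)
	if err != nil {
		return 0, err
	}
	if st != c.initStartTime {
		return 0, errors.New("pid was reused by another process")
	}
	return c.initPid, nil
}

// IsAlive is for callers that expect a live container (for example exec
// sync): a failed verification is an error, and the reason is propagated.
func (c *container) IsAlive() error {
	if _, err := c.verifyPid(); err != nil {
		return fmt.Errorf("container %s is not alive: %w", c.id, err)
	}
	return nil
}

// hasExited is the stop-path view: a failed verification is the expected
// outcome there, so it becomes a plain boolean instead of an error.
func hasExited(c *container) bool {
	_, err := c.verifyPid()
	return err != nil
}

func main() {
	gone := &container{
		id: "ctr1", initPid: 4242, initStartTime: "1000",
		startTimeOf: func(int) (string, error) { return "", errors.New("no such process") },
	}
	fmt.Println("exec sync:", gone.IsAlive())
	fmt.Println("stop loop: exited =", hasExited(gone))
}
```

Both helpers share one verification routine, so the only duplication is in how each caller interprets a failure.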

@haircommander
Member Author

/retest

@mrunalp
Member

mrunalp commented Aug 7, 2020

/test e2e-aws

@haircommander
Member Author

/retest

@mrunalp
Member

mrunalp commented Aug 10, 2020

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Aug 10, 2020
@haircommander
Member Author

/retest

@haircommander
Member Author

/retest

@openshift-merge-robot openshift-merge-robot merged commit 7184c0f into cri-o:master Aug 11, 2020
Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. lgtm Indicates that a PR is ready to be merged. release-1.19

8 participants