conmon: add support to restore a container #1427
Conversation
Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA. It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.
Hi @adrianreber. Thanks for your PR. I'm waiting for an openshift or kubernetes-incubator member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test.
/ok-to-test

@adrianreber you need to fill out the CLA/Linux Foundation stuff in order to contribute.

@adrianreber BTW, thanks for the PR.
conmon/conmon.c (outdated)

```c
{ "cid", 'c', 0, G_OPTION_ARG_STRING, &opt_cid, "Container ID", NULL },
{ "cuuid", 'u', 0, G_OPTION_ARG_STRING, &opt_cuuid, "Container UUID", NULL },
{ "runtime", 'r', 0, G_OPTION_ARG_STRING, &opt_runtime_path, "Runtime path", NULL },
{ "restore", 0, 0, G_OPTION_ARG_NONE, &opt_restore, "Restore a container from a checkpoint", NULL },
```
Do you need to have an argument for the checkpoint to restore from?
The checkpoint to restore from is defined by the container ID and the bundle.
@rhatdan my Linux Foundation ID does not use my Red Hat email address. Who do I need to contact to get my Linux Foundation ID added to Red Hat's organization? Or do I need a new account?

Can you add that email to your GitHub account (you can add more than one), or just sign the CLA with the other email too?

I created a new account and am now authorized to contribute code to this project.
/test all

/retest
Further testing with podman on my side has shown that @TomSweeneyRedHat was right that an explicit definition of the checkpoint directory makes sense, especially when looking at further enhancements like pre-copy or post-copy container migration using multiple checkpoints. I need to update this PR. Please do not merge.
runc supports checkpointing and restoring containers with the help of CRIU. To checkpoint a container from podman it is enough to just call runc to checkpoint the container. To restore a container with podman, the resulting container should again be under the control of conmon. This extends conmon to be able to also restore a container.

Signed-off-by: Adrian Reber <[email protected]>
The conmon changes in this PR are needed to support checkpointing and restoring in podman: containers/podman#469
One question I have is why does this need to go through conmon? Can't podman call runc directly?
@mrunalp: Initially I called runc directly from podman, but the resulting container is then not running under the control of conmon. A newly started container, however, is running under the control of conmon. I do not know the reasons why the containers are running under conmon, but I tried to replicate the state of a newly created container with a restored container.
@adrianreber Given this I definitely agree with the decision to do it through conmon.
@adrianreber @mheon okay, sounds good.

/test all
```c
/*
 * '--work-path' is the directory CRIU will run in and
 * also place its log files.
 */
add_argv(runtime_argv, "--detach",
```
Will adding --detach prevent us from attaching to the container once it has been restored?
No, those are different attaches. The podman attach is using the attach socket to talk to the container, and that already works. I already tried it with registry.fedoraproject.org/f26/httpd, and after a restore I can see the requests to the httpd server being logged on podman attach -l.
The runc restore --detach is the same detach as runc run --detach, which will immediately return to the shell.
Ack, just wanted to make sure
LGTM

All green, merging.