Conversation

@littlejawa
Contributor

@littlejawa littlejawa commented Feb 2, 2023

What type of PR is this?

/kind bug

What this PR does / why we need it:

The integration test "ctr pod lifecycle with evented pleg enabled" is failing for kata containers.
This is because of how the container's exit is handled for kata.

For regular containers, the exit is detected by server.exitsMonitor(), which calls handleExit(); that is where the CONTAINER_STOPPED_EVENT event is generated.

For kata containers, the exit is detected by a wait function that is started along with the container, and that function doesn't send the event.

The event could be sent by calling server.generateCRIEvent(), but holding a handle on the server in runtimeVM seems cumbersome.
I also think runtimeVM containers would benefit from going through the handleExit() code in general, rather than handling the exit separately.

This is why I modified the wait function for runtimeVM so that it just creates a file in the exitsPath, which exitsMonitor() is watching. The watcher then triggers handleExit() for kata containers too, which generates the event.

The second patch fixes an issue triggered by the removal of the container. The code updates the container status, but the container is already gone by then, which triggered an error on the runtimeVM side and prevented the CONTAINER_DELETED_EVENT from being generated. Ignoring the error in this specific situation seems to solve the issue.

Which issue(s) this PR fixes:

Fixes #6481

Special notes for your reviewer:

This PR stems from the discussion on #6531. While the original fix proposed there was wrong, the discussion helps clarify the requirements for evented PLEG.

Does this PR introduce a user-facing change?

none

@openshift-ci openshift-ci bot added release-note-none Denotes a PR that doesn't merit a release note. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. kind/bug Categorizes issue or PR as related to a bug. labels Feb 2, 2023
@openshift-ci openshift-ci bot requested review from QiWang19 and klihub February 2, 2023 15:19
@TomSweeneyRedHat
Contributor

Changes look OK to me, but lint isn't buying it.

runtimeVM monitors the execution of the container on its own,
without using conmon. Because of that, when the container exits,
the processing differs from that of regular containers.
This causes some issues, such as events not being generated on time.

This commit makes runtimeVM create a file under the "ContainerExitPath"
in the same way that conmon does, so that the Server.monitorExits() function
can pick it up and run the required processing for those containers too.

Signed-off-by: Julien Ropé <[email protected]>
Updating the container's status when it's already removed causes an error.
We can ignore this error safely when we find the container was terminated already.

Signed-off-by: Julien Ropé <[email protected]>
@codecov

codecov bot commented Feb 3, 2023

Codecov Report

Merging #6603 (4cf3d37) into main (9de68f5) will decrease coverage by 0.02%.
The diff coverage is 0.00%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6603      +/-   ##
==========================================
- Coverage   44.58%   44.56%   -0.02%     
==========================================
  Files         128      128              
  Lines       14880    14887       +7     
==========================================
+ Hits         6634     6635       +1     
- Misses       7451     7457       +6     
  Partials      795      795              

@littlejawa
Contributor Author

/test kata-containers

@sohankunkerkar
Member

/retest

@littlejawa
Contributor Author

/test kata-containers

Member

@saschagrunert saschagrunert left a comment

/retest

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Feb 7, 2023
@openshift-ci
Contributor

openshift-ci bot commented Feb 7, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: littlejawa, saschagrunert

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 7, 2023
@haircommander
Member

/override ci/prow/e2e-aws-ovn
/override ci/prow/e2e-gcp-ovn

@openshift-ci
Contributor

openshift-ci bot commented Feb 7, 2023

@haircommander: Overrode contexts on behalf of haircommander: ci/prow/e2e-aws-ovn, ci/prow/e2e-gcp-ovn

Details

In response to this:

/override ci/prow/e2e-aws-ovn
/override ci/prow/e2e-gcp-ovn

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@haircommander
Member

/retest

@sohankunkerkar
Member

/test ci-fedora-integration

@saschagrunert
Member

/override ci/prow/e2e-gcp-ovn
/override ci/prow/e2e-aws-ovn

@openshift-ci
Contributor

openshift-ci bot commented Feb 8, 2023

@saschagrunert: Overrode contexts on behalf of saschagrunert: ci/prow/e2e-aws-ovn, ci/prow/e2e-gcp-ovn

Details

In response to this:

/override ci/prow/e2e-gcp-ovn
/override ci/prow/e2e-aws-ovn

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-merge-robot openshift-merge-robot merged commit 3a691b1 into cri-o:main Feb 8, 2023
@littlejawa littlejawa deleted the fix_6481 branch March 27, 2023 14:54

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. kind/bug Categorizes issue or PR as related to a bug. lgtm Indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

kata tests are broken

6 participants