Conversation

@haircommander (Member) commented Jul 20, 2020

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

There are cases where crio doesn't get the chance to sync before shutdown.
In these cases, container storage can be corrupted.
We need to protect against this by wiping all of storage if we detect we didn't shut down cleanly.

Add an option to specify a clean_shutdown_file that crio will create upon syncing at shutdown
Add an option to crio-wipe to clear all of storage if that file is not present
Add integration tests to verify
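
A minimal sketch of the wipe-side idea described above, not the actual crio-wipe code: shutdownFilePresent and wipeStorage are hypothetical helpers, and the graph-root path is only an assumed example.

	package main

	import (
		"fmt"
		"os"
	)

	// shutdownFilePresent reports whether the clean shutdown file exists,
	// i.e. whether crio managed to sync and write it before the last shutdown.
	func shutdownFilePresent(path string) bool {
		_, err := os.Stat(path)
		return err == nil
	}

	// wipeStorage stands in for clearing the container storage directory.
	func wipeStorage(graphRoot string) error {
		fmt.Println("wiping storage under", graphRoot)
		return os.RemoveAll(graphRoot)
	}

	func main() {
		// default marker location from this PR; the graph root below is an example path
		const cleanShutdownFile = "/var/lib/crio/clean.shutdown"
		if !shutdownFilePresent(cleanShutdownFile) {
			if err := wipeStorage("/var/lib/containers/storage"); err != nil {
				fmt.Fprintln(os.Stderr, err)
				os.Exit(1)
			}
		}
	}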

Which issue(s) this PR fixes:

Special notes for your reviewer:

Does this PR introduce a user-facing change?

add clean_shutdown_file option to allow crio/crio wipe to verify crio had time to shut down cleanly

@openshift-ci-robot added the release-note, dco-signoff: yes, and kind/cleanup labels Jul 20, 2020
@openshift-ci-robot added the approved label Jul 20, 2020
cmd/crio/main.go Outdated
logrus.Fatal(err)
}
// Finally, we clear out the shutdown file
if err := os.Remove(config.CleanShutdownFile); err != nil {
Member:

As a follow-on we also want to report unclean shutdown as a prometheus metric.

server/server.go Outdated

syscall.Sync()

f, err := os.Create(s.config.CleanShutdownFile)
Member:

We will have to sync this file :)

Member Author:

Getting this atomic is challenging. If we call Create and then immediately call Sync(), are we guaranteed that storage is synced before CleanShutdownFile? If not, we could believe we shut down cleanly but actually still have corrupted storage. In this case, I'd rather make sure the Create() call happens after storage is on disk.

If you'd rather, I can add another Sync() after the create, though.

@mrunalp (Member) commented Jul 20, 2020:

We can do a separate fsync for this file, and worst case, if it fails, we end up wiping storage. The fsync for this file is to reduce the possibility of that happening. We could still get the power cable yanked after the sync and before the fsync of this file, and that's okay.
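
A minimal sketch of the approach described above (sync storage, then create and fsync the marker file), assuming a hypothetical writeCleanShutdownFile helper rather than the actual server shutdown code:

	package main

	import (
		"os"
		"syscall"

		"github.com/sirupsen/logrus"
	)

	// writeCleanShutdownFile flushes storage changes and then records that the
	// shutdown completed cleanly by creating and fsyncing the marker file.
	func writeCleanShutdownFile(path string) {
		// first, make sure all storage changes hit the disk
		syscall.Sync()

		f, err := os.Create(path)
		if err != nil {
			logrus.Errorf("Failed to create clean shutdown file: %v", err)
			return
		}
		defer f.Close()

		// fsync the file itself; worst case, if this fails we wipe storage on the next boot
		if err := f.Sync(); err != nil {
			logrus.Errorf("Failed to sync clean shutdown file: %v", err)
		}
	}

	func main() {
		writeCleanShutdownFile("/var/lib/crio/clean.shutdown")
	}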

Member Author:

fixed

Member:

you could just fsync the parent directory, as in the suggestion I've made above.

Member:

I don't think it matters much to get it atomic. In the unlikely case of a crash between the file creation and the fsync of the parent directory, the file won't be found on the next reboot and the storage is wiped.

@haircommander force-pushed the clean-shutdown-1.18 branch 2 times, most recently from d83f893 to 23f6d8e on July 20, 2020 23:24
Comment on lines 110 to 111
complete -c crio -n '__fish_crio_no_subcommand' -f -l selinux -d 'Enable selinux support (default: false)'
complete -c crio -n '__fish_crio_no_subcommand' -f -l selinux -d 'Enable selinux support (default: true)'
Member:

I guess this causes CI issues.

Member Author:

damn it keeps sneaking in there 😃

cmd/crio/wipe.go Outdated
return err
}
if len(crioContainers) != 0 {
logrus.Infof("wiping containers")
Member:

Suggested change
logrus.Infof("wiping containers")
logrus.Info("Wiping containers")

cmd/crio/main.go Outdated
// Finally, we clear out the shutdown file
if err := os.Remove(config.CleanShutdownFile); err != nil {
// not a fatal error, as it could have been cleaned up
logrus.Errorf(err.Error())
Member:

Suggested change
logrus.Errorf(err.Error())
logrus.Error(err)

# Location for CRI-O to lay down the clean shutdown file.
# It is used to check whether crio had time to sync before shutting down.
# If not, crio wipe will clear the storage directory.
clean_shutdown_file = "{{ .CleanShutdownFile }}"
Member:

Do we want to be able to disable this feature if clean_shutdown_file = "" or commented out?

Member Author:

I have added this
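
For illustration, a sketch of how the wipe side could treat an empty clean_shutdown_file as "feature disabled"; shouldWipeForUncleanShutdown is a hypothetical helper, not necessarily what the PR adds:

	package main

	import (
		"fmt"
		"os"
	)

	// shouldWipeForUncleanShutdown returns true only when the feature is enabled
	// (non-empty path) and the clean shutdown file is missing, meaning the
	// previous shutdown did not complete cleanly.
	func shouldWipeForUncleanShutdown(cleanShutdownFile string) bool {
		if cleanShutdownFile == "" {
			// feature disabled via clean_shutdown_file = ""
			return false
		}
		_, err := os.Stat(cleanShutdownFile)
		return os.IsNotExist(err)
	}

	func main() {
		fmt.Println(shouldWipeForUncleanShutdown(""))                             // false: feature disabled
		fmt.Println(shouldWipeForUncleanShutdown("/var/lib/crio/clean.shutdown")) // true if the file is missing
	}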

cmd/crio/main.go Outdated
logrus.Fatal(err)
}
// Finally, we clear out the shutdown file
if err := os.Remove(config.CleanShutdownFile); err != nil {
Member:

Let's play it safe here and add:

	f, err := os.OpenFile(filepath.Dir(config.CleanShutdownFile), os.O_RDONLY, 0755)
	if err != nil {
		...
	}
	defer f.Close()
	
	if err = syscall.Fsync(int(f.Fd())); err != nil {
		...
	}

after the file is removed, so we are sure the parent directory is synced to disk
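
A fuller sketch of that suggestion at the startup/removal site, with removeCleanShutdownFile as a hypothetical helper name; the error handling here is illustrative only, not the PR's actual code:

	package main

	import (
		"os"
		"path/filepath"
		"syscall"

		"github.com/sirupsen/logrus"
	)

	// removeCleanShutdownFile deletes the marker at startup and fsyncs its parent
	// directory so the removal itself is durable on disk.
	func removeCleanShutdownFile(path string) {
		// not a fatal error, as the file could have been cleaned up already
		if err := os.Remove(path); err != nil && !os.IsNotExist(err) {
			logrus.Error(err)
		}

		dir, err := os.OpenFile(filepath.Dir(path), os.O_RDONLY, 0755)
		if err != nil {
			logrus.Error(err)
			return
		}
		defer dir.Close()

		if err := syscall.Fsync(int(dir.Fd())); err != nil {
			logrus.Error(err)
		}
	}

	func main() {
		removeCleanShutdownFile("/var/lib/crio/clean.shutdown")
	}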

server/server.go Outdated

// first, make sure we sync all storage changes
syscall.Sync()

Member:

It will be safer to use Syncfs in addition to Sync, so we have some error reporting:

	f, err := os.OpenFile(store.GraphRoot(), os.O_RDONLY, 0755)
	if err != nil {
		...
	}
	defer f.Close()
	
	if err = unix.Syncfs(int(f.Fd())); err != nil {
		...
	}

so we are sure the file system holding the graphroot is correctly synced.

The cost should be minimal after the full sync, and we get an error if something goes wrong.

Member Author:

I went with Fsync, as it seems to be more of what we want.

@giuseppe (Member) left a comment:

Good work, I left some comments on the sync machinery.

complete -c crio -n '__fish_crio_no_subcommand' -f -l apparmor-profile -r -d 'Name of the apparmor profile to be used as the runtime\'s default. This only takes effect if the user does not specify a profile via the Kubernetes Pod\'s metadata annotation.'
complete -c crio -n '__fish_crio_no_subcommand' -f -l bind-mount-prefix -r -d 'A prefix to use for the source of the bind mounts. This option would be useful if you were running CRI-O in a container. And had `/` mounted on `/host` in your container. Then if you ran CRI-O with the `--bind-mount-prefix=/host` option, CRI-O would add /host to any bind mounts it is handed over CRI. If Kubernetes asked to have `/var/lib/foobar` bind mounted into the container, then CRI-O would bind mount `/host/var/lib/foobar`. Since CRI-O itself is running in a container with `/` or the host mounted on `/host`, the container would end up with `/var/lib/foobar` from the host mounted in the container rather then `/var/lib/foobar` from the CRI-O container. (default: "")'
complete -c crio -n '__fish_crio_no_subcommand' -f -l cgroup-manager -r -d 'cgroup manager (cgroupfs or systemd)'
complete -c crio -n '__fish_crio_no_subcommand' -l clean-shutdown-file -r -d 'Location for CRI-O to lay down the clean shutdown file. It indicates whether we\'ve had time to sync changes to disk before shutting down. If not, crio wipe will clear the storage directory'
Contributor:

maybe?

Suggested change
complete -c crio -n '__fish_crio_no_subcommand' -l clean-shutdown-file -r -d 'Location for CRI-O to lay down the clean shutdown file. It indicates whether we\'ve had time to sync changes to disk before shutting down. If not, crio wipe will clear the storage directory'
complete -c crio -n '__fish_crio_no_subcommand' -l clean-shutdown-file -r -d 'Location for CRI-O to lay down the clean shutdown file. It indicates whether we\'ve had time to sync changes to disk before shutting down. If not found, crio wipe will clear the storage directory'

docs/crio.8.md Outdated

**--cgroup-manager**="": cgroup manager (cgroupfs or systemd) (default: systemd)

**--clean-shutdown-file**="": Location for CRI-O to lay down the clean shutdown file. It indicates whether we've had time to sync changes to disk before shutting down. If not, crio wipe will clear the storage directory (default: /var/lib/crio/clean.shutdown)
Contributor:

If you take "found" above, add it here too.

**clean_shutdown_file**="/var/lib/crio/clean.shutdown"
Location for CRI-O to lay down the clean shutdown file.
It is used to check whether crio had time to sync before shutting down.
If not, crio wipe will clear the storage directory.
Contributor:

Maybe "found" here too.

},
&cli.StringFlag{
Name: "clean-shutdown-file",
Usage: "Location for CRI-O to lay down the clean shutdown file. It indicates whether we've had time to sync changes to disk before shutting down. If not, crio wipe will clear the storage directory",
Contributor:

Ditto "found".

@haircommander force-pushed the clean-shutdown-1.18 branch 2 times, most recently from e84579d to 28241c6 on July 21, 2020 15:42
There are cases where crio doesn't get the chance to sync before shutdown.
In these cases, container storage can be corrupted.
We need to protect against this by wiping all of storage if we detect we didn't shut down cleanly.

Add an option to specify a clean_shutdown_file that crio will create upon syncing at shutdown
Add an option to crio-wipe to clear all of storage if that file is not present
Add integration tests to verify

Signed-off-by: Peter Hunt <[email protected]>
@codecov bot commented Jul 21, 2020

Codecov Report

Merging #3984 into release-1.18 will decrease coverage by 0.09%.
The diff coverage is 15.62%.

@@               Coverage Diff                @@
##           release-1.18    #3984      +/-   ##
================================================
- Coverage         40.82%   40.72%   -0.10%     
================================================
  Files               106      106              
  Lines              8703     8734      +31     
================================================
+ Hits               3553     3557       +4     
- Misses             4837     4860      +23     
- Partials            313      317       +4     

@saschagrunert (Member) left a comment:

LGTM

@openshift-ci-robot:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: haircommander, saschagrunert

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [haircommander,saschagrunert]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@saschagrunert (Member):

/retest

1 similar comment
@saschagrunert (Member):

/retest

@saschagrunert (Member):

/retest

@rhatdan (Contributor) commented Jul 22, 2020:

/test kata-containers

@saschagrunert (Member):

/retest

@saschagrunert (Member) commented Jul 22, 2020:

Hm, kata seems broken unfortunately:

11:02:26 #   `OVERRIDE_OPTIONS="--additional-devices /dev/null:/dev/qifoo:rwm" start_crio' failed
11:02:26 # time="2020-07-22T11:02:22Z" level=error msg="error opening storage: /dev/sdb is already part of a volume group \"storage\": must remove this device from any volume group or provide a different device"
11:02:26 # time="2020-07-22T11:02:24Z" level=fatal msg="failed to connect: failed to connect, make sure you are running as root and the runtime has been started: context deadline exceeded"
11:02:26 # time="2020-07-22T11:02:26Z" level=fatal msg="failed to connect: failed to connect, make sure you are running as root and the runtime has been started: context deadline exceeded"

@haircommander (Member Author):

/test kata-containers

@openshift-ci-robot:

@haircommander: The following test failed, say /retest to rerun all failed tests:

Test name: ci/kata-jenkins
Commit: 564915e
Rerun command: /test kata-containers


@umohnani8 (Member):

LGTM

@haircommander (Member Author):

/hold

Let's get the master version in first, and let it sit a bit.

@openshift-ci-robot added the do-not-merge/hold label Jul 22, 2020
@sdodson commented Jul 30, 2020:

> Let's get the master version in first, and let it sit a bit.

Where "a bit" means days or weeks of soak time and careful scrutiny, please. The master PR is #3999, for anyone else who tracks the 4.6 BZ to this PR and wonders where the master branch PR is.

@haircommander (Member Author):

I do not think we want this anymore


Labels

approved — Indicates a PR has been approved by an approver from all required OWNERS files.
dco-signoff: yes — Indicates the PR's author has DCO signed all their commits.
do-not-merge/hold — Indicates that a PR should not merge because someone has issued a /hold command.
kind/cleanup — Categorizes issue or PR as related to cleaning up code, process, or technical debt.
release-note — Denotes a PR that will be considered when it comes time to generate release notes.


9 participants