-
Notifications
You must be signed in to change notification settings - Fork 155
Improve error handling #613
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Fixes my testcase, thanks: |
hannesm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good to me
| if (ret != wr->len) { | ||
| /* Report that it failed somehow */ | ||
| wr->ret = SOLO5_R_EUNSPEC; | ||
| return; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This changes the semantics so the unikernel can continue if the write fails. Which is perhaps okay. It's hard for the unikernel to act on it, though(?!)
From what I can read from man -s2 pwrite write it can return less if there is not enough disk space. Considering the file is checked to have a fixed size from the beginning I can only imagine this happening if the file is a) sparse, or b) has been truncated(!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It’s always unclear whether there’s anything useful to do when things are starting to fail when they really shouldn’t, indeed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not entirely clear to me whether it is worth to handle this situation gracefully. What is wrong about assertion failure and the unikernel exiting?
Handling this properly from the unikernel side may require some further information, i.e. transport the data from errno from the tender to the unikernel, and then deciding what to do in the unikernel.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed I'd leave it as assertion failure for now, and if there's a convincing case how to handle such a pwrite error, we need to read errno and transport it to the unikernel so it can decide what to do. The same for the block read, and also network read and write.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assertion failure can create a coredump on the host, which allows a unikernel to consume (potentially unbounded) disk space on the host. This could get especially bad if due to a bug it'd immediately crash again in the same place after a restart.
I think it'd be fine if the unikernel (tender) exited with a fatal error in this case, as long as it doesn't actually crash with a coredump, e.g. it could print an error message and exit with a specific exitcode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍🏾 sounds good to me to (f)printf a message and exit(1).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the PR accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I only put a minor comment to use strerror()
| if (ret != rd->len) { | ||
| /* Report that it failed somehow */ | ||
| rd->ret = SOLO5_R_EUNSPEC; | ||
| return; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can only happen if the file was truncated(?)
979bf67 to
026cafa
Compare
96b86eb to
2214db4
Compare
|
Would you mind to rebase and squash, so the CI has a chance to run this PR? Also, is there a very small ELF binary that you could add for testing? |
It is ~33KiB (attaching Zipped because github doesn't allow me to attach the binary directly) |
| return; | ||
| } | ||
| assert(ret > 0); | ||
| if (ret <= 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with #559 in mind, I think we can allow ret == 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's keep that for a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh! you’re right, that’s a bit silly. I’ll fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's leave it here as it was (to avoid changing semantics here), and discuss in #618 where @dinosaure has comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. (I fixed at least the silly message.)
Co-authored-by: Edwin Török <[email protected]>
Stop `assert`-ing conditions that could invalidated by specially crafted ELF files. Use sensible errors instead.
Avoid `assert`-ing a condition that is controlled by the unikernel
2214db4 to
14b3884
Compare
hannesm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CI is green as well. Test has been added.
|
I added Edwin’s torture test in a first commit, to ensure that running the tests in that version yields: |
14b3884 to
c2354b3
Compare
If an I/O operation failed on the host (whatever the reason), display an explicit message about it and exit normally instead of aborting on an assertion failure This hardens the hvt tender against unikernels which could try to make it fail an assertion (to try and create core dumps, if they are enabled)
c2354b3 to
1ffc06a
Compare
|
Thanks for your work. |
- Fix compilation on OpenBSD about `#ifdef` (@omegametabroccolo, @hannesm, @reynir, Solo5/solo5#614, related with Solo5/solo5#600) - Add GitHub actions to test Solo5 on different platforms (@hannesm, Solo5/solo5#616, Solo5/solo5#617 & Solo5/solo5#540) - Do not use `-Wstringopt-overflow` when we use `clang` for `test_ssp` (@hannesm, Solo5/solo5#607) - Show errors and exit instead of `assert false` (@shym, @edwintorok, @hannesm, @reynir, Solo5/solo5#613 & Solo5/solo5#612) - Define `.note.GNU-stack` when it's needed (@shym, @Kensan, @hannesm, @palainp, @dinosaure, Solo5/solo5#619, Solo5/solo5#570 & Solo5/solo5#562) - Detect MTU on TAP interface for hvt, spt and virtio (@reynir, @hannesm, @dinosaure, Solo5/solo5#605 & Solo5/solo5#606)
This PR replaces a couple of
asserts with a proper, or at least better, handling of the corresponding error. In particular, it replacesasserts that check conditions that can, or maybe could, be controlled by the unikernel, either in its file format (hardening the ELF parser) or in the hypercalls it triggers. It also takes the opportunity to report (unspecified) error to the unikernels when an I/O operation failed on the host.This is addressing #612, hopefully.