Split up the buffer reservation API #2031
kazuho
left a comment
Thank you for working on this.
Splitting h2o_buffer_reserve into two functions is fine, but I am not sure we are changing the correct invocations to try-reserve. The function calls that we should change are the ones that allocate unbound amount of memory (i.e., the request body buffer used when streaming request is off, the response body buffer in each handler being used when streaming response is off).
lib/http2/connection.c
Outdated
Same as above. IIRC the upper limit here is 16KB (maximum HTTP/2 frame size that we accept).
Confirmed. H2O_HTTP2_SETTINGS_HOST_MAX_FRAME_SIZE is enforced when decoding the frame in h2o_http2_decode_frame(). Violation of this results in H2O_HTTP2_ERROR_FRAME_SIZE. Thanks for the exercise!
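For readers following along, here is a minimal sketch of why a reservation sized by a frame is bounded: the payload length is read from the 9-byte HTTP/2 frame header and rejected before any buffer growth if it exceeds the limit. The function name and the error constants are hypothetical stand-ins, not h2o's definitions, and the 16384 limit is simply the 16KB figure quoted above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; these are not h2o's constants. */
#define SKETCH_MAX_FRAME_SIZE 16384 /* the 16KB limit mentioned in the review */
#define SKETCH_ERROR_INCOMPLETE -1
#define SKETCH_ERROR_FRAME_SIZE -2

/* Reads the 24-bit payload length from a 9-byte HTTP/2 frame header.
 * Returns the payload length, or a negative error code. */
long sketch_decode_frame_length(const uint8_t *src, size_t len)
{
    if (len < 9)
        return SKETCH_ERROR_INCOMPLETE; /* header not fully received yet */
    uint32_t payload_len = ((uint32_t)src[0] << 16) | ((uint32_t)src[1] << 8) | src[2];
    if (payload_len > SKETCH_MAX_FRAME_SIZE)
        return SKETCH_ERROR_FRAME_SIZE; /* rejected before any buffer is grown */
    return (long)payload_len;
}
```

Because the check runs before the payload is buffered, any reservation made for a single frame in this sketch can never exceed the limit plus the header size.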
lib/http2/connection.c
Outdated
IIRC the upper size of conn->_write.buf is capped by something slightly above 64KB.
I think we might also want to log the amount of memory we tried to allocate, as well as the file and the line number of the caller, when allocation fails. That has become possible thanks to #2020. Maybe you want to make that change in this PR? (We can also do it after this PR gets merged.)
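As an illustration of the idea, a failure message can carry the caller's location if the call goes through a macro that captures `__FILE__` and `__LINE__`. This is only a sketch: the helper and macro names are hypothetical, and it does not reproduce whatever #2020 actually added.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats a diagnostic naming the caller and the requested size.
 * Returns what snprintf returns (the length of the full message). */
int sketch_format_alloc_failure(char *dst, size_t capacity, const char *file, int line, size_t amount)
{
    return snprintf(dst, capacity, "%s:%d: failed to allocate %zu bytes", file, line, amount);
}

/* The macro expands at the call site, so the message points at the caller
 * rather than at the helper itself. */
#define SKETCH_FORMAT_ALLOC_FAILURE(dst, capacity, amount)                                         \
    sketch_format_alloc_failure((dst), (capacity), __FILE__, __LINE__, (amount))
```

An aborting reserve function could print such a message right before calling `abort()`.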
Thank you for the review. Logging the amount of memory sounds fantastic. I'll rebase this branch on master and take advantage of @chenbd's neat work. I'll address your other comments as well. Until then... 👋
Comments addressed. As for EDIT: Here's what the new fatal output looks like:
Thank you for the changes. I think we are making good progress, but also have some more work to do.
In my previous review, I stated, quote: the function calls that we should change are the ones that allocate unbound amount of memory (i.e., the request body buffer used when streaming request is off, the response body buffer in each handler being used when streaming response is off).
What I was trying to suggest, along with explaining how you might determine what needs to be done for each invocation, is that you need to check the invocations of h2o_buffer_reserve across the entire source tree (not just HTTP/2), because changing what the function does changes the behavior of every call site.
I understand that it's going to be tough, but it's something we need to do in order to land this PR.
"I'll rebase this branch on master"
Generally speaking, I'd appreciate it if you could push additional commits instead, because a rebase destroys the context of PR reviews and causes divergence between the local repository and the h2o repository should you have pulled the PR from the refs/pull/<no> branch.
lib/http2/connection.c
Outdated
  { /* send SETTINGS and connection-level WINDOW_UPDATE */
-     h2o_iovec_t vec = h2o_buffer_reserve(&conn->_write.buf, SERVER_PREFACE.len);
+     h2o_iovec_t vec = h2o_buffer_try_reserve(&conn->_write.buf, SERVER_PREFACE.len);
I think you missed this.
lib/http2/connection.c
Outdated
  if (conn->_http1_req_input->size > reqsize) {
      size_t remaining_bytes = conn->_http1_req_input->size - reqsize;
-     h2o_buffer_reserve(&sock->input, remaining_bytes);
+     if ((h2o_buffer_try_reserve(&sock->input, remaining_bytes)).base == NULL) {
I would appreciate it if you could determine if we need to make this change.
Gladly. My take is that the upper-bound on H2 upgrade is protected by limit-request-body. In fact, we won't even make it to h2o_http2_handle_upgrade() if the entity size exceeds the limit, thanks to the entity size validation earlier in the codepath (specifically around entity_readers).
Here's a quick illustration with an H2O proxy that has a small limit-request-body setting:
> POST / HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.54.0
> Accept: */*
> Connection: Upgrade, HTTP2-Settings
> Upgrade: h2c
> HTTP2-Settings: AAMAAABkAARAAAAAAAIAAAAA
> Content-Length: 1175
> Content-Type: application/x-www-form-urlencoded
> Expect: 100-continue
>
< HTTP/1.1 413 Request Entity Too Large
< Connection: close
< Content-Length: 27
< Server: h2o/2.3.0-DEV@e9799f71
< content-type: text/plain; charset=utf-8
My conclusion is: No, we don't need this change. I'll fix it along with the one I missed earlier. Thanks!
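The reasoning above boils down to a size check that runs before the upgrade handler is ever reached. A minimal sketch of that kind of check follows; the function name is hypothetical, and treating a body exactly at the limit as acceptable is an assumption rather than something confirmed in this thread.

```c
#include <stddef.h>

/* Hypothetical helper: returns the HTTP status to reply with, or 0 to keep processing.
 * Assumption: a body exactly at the limit is still accepted. */
int sketch_check_entity_size(size_t content_length, size_t limit_request_body)
{
    if (content_length > limit_request_body)
        return 413; /* Request Entity Too Large: stop before any upgrade handling */
    return 0;
}
```

With a check like this in front, the number of bytes left over for the socket input buffer at upgrade time cannot exceed the configured limit, which is what makes the reservation bounded.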
Done: b275c7a -- I'm now moving on to tracking down unbounded allocation requests throughout the source tree.
Thanks for the follow-up! I see your point about going through the entire tree. I admit it's beyond what I initially anticipated, but I agree that it must be done. Exactly this 😅:
Also ack regarding your preference to keep stacking commits.
h2o_buffer_try_reserve should only be used for allocations of an unbounded amount of memory. The buffer allocations in the H2 connection so far do not qualify.
For posterity,
@kazuho I was able to verify that it's possible to grow the H2 request buffer against handlers that do not support streaming in
If the second point is valid, then it would be nice to change the API of |
Basically the same as e33175c, but for the HTTP/1 frontend. The h2o_buffer_append in the http2 code has been reverted, because the error check makes sense given the existing API.
The fcgi response (up to 64KB) is accumulated into the response buffer using h2o_buffer_reserve. Use h2o_buffer_try_reserve instead. Other invocations of h2o_buffer_reserve can remain as they are because the upper bound of the allocation size is known to be 64KB.
While this change might be too defensive, I was able to trigger an mmap call by preparing a large directory on the APFS file system, which is comparable to other file systems in this particular context. I admit that my test case is unorthodox but at the same time, I can't say that it's impossible for an innocent user to have a big directory.
@toru I think your observation is correct. Regarding the function names and the return values, the high-level agreement is that we should split the buffer functions into one that dies on allocation failure and one that returns an error. I prefer renaming the functions that return allocation failures to include "try" in their names, because we have sometimes forgotten to check the return values of these functions. Having "try" would help us avoid the problem. The flip side of that is that changing the semantics (including the return type) of WDYT?
|
Thank you for the follow up! I agree with your view that having the word |
This buffer grows proportionally to the number of requests that are in flight; therefore, the maximum buffer size is unbounded. As a result of this change, h1/h2's foreach_request function can fail early, hence the error check.
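To make the "fail early" part concrete, here is a sketch of an iteration that stops at the first callback error and returns it to the caller. The signatures are hypothetical and simplified (plain integer IDs instead of request objects), and a byte quota stands in for memory that may run out.

```c
#include <stddef.h>

/* Hypothetical callback type: a non-zero return value stops the iteration. */
typedef int (*sketch_request_cb)(int request_id, void *cbdata);

/* Visits each in-flight request; returns the first non-zero callback result, or 0. */
int sketch_foreach_request(const int *request_ids, size_t count, sketch_request_cb cb, void *cbdata)
{
    for (size_t i = 0; i != count; ++i) {
        int ret = cb(request_ids[i], cbdata);
        if (ret != 0)
            return ret; /* fail early and let the caller handle the error */
    }
    return 0;
}

/* Collector state: the quota stands in for memory that may run out. */
typedef struct {
    size_t used;
    size_t quota;
} sketch_collector_t;

/* Example callback: "reserves" 64 bytes per request, failing once the quota is exhausted. */
int sketch_collect_status(int request_id, void *cbdata)
{
    sketch_collector_t *collector = cbdata;
    (void)request_id; /* unused in this sketch */
    if (collector->quota - collector->used < 64)
        return -1;
    collector->used += 64;
    return 0;
}
```

The space consumed grows with the number of requests visited, so the caller has to be prepared for the iteration to return an error partway through.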
h2o_buffer_try_reserve does not abort on mmap failure; therefore, check the return value and bubble up the error.
|
All |
This change ensures that the behavior on buffer allocation failure remains the same at the socket layer. Even though the allocation amount is fixed at 4096 bytes inside the event loop backends, the calls to h2o_buffer_reserve were changed to h2o_buffer_try_reserve so that the event library can decide how to handle an allocation failure.
The buffer allocation in send_chunk_method() needs to be replaced with h2o_buffer_try_reserve, because the amount of memory allocated grows proportionally to the size of the output. On the other hand, the buffer allocation in post_error() can remain as-is, because the allocation size is determined by short constant strings.
@kazuho I think this PR is ready for another glance. I'm fairly confident that I was able to address most (if not all) of the unbounded allocations. That said, there is a chance that I might have missed a case due to the sparse nature of the changeset.
Split up the buffer reservation API
Thank you for your patience. Merged to master. I made some changes; please see the commits.
Objective
Ensure that failures of unbounded memory allocations (e.g. buffers in the non-streaming code path) are checked and handled gracefully. For bounded memory allocations (e.g. low-layer frame handling), abort with information about the underlying h2o_buffer_t and the allocation amount.
Design
Add a variant of h2o_buffer_reserve that is forgiving on mmap failure.
Add a variant of h2o_buffer_append that is forgiving on mmap failure.
Change h2o_buffer_(reserve|append) to abort on mmap failure.
Replace unbounded h2o_buffer_(reserve|append) calls with their forgiving variants, check the allocation result, and handle accordingly.
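The design can be sketched in miniature as follows. This is not h2o's implementation: the types and the `sketch_` names are hypothetical, and `realloc` stands in for the mmap-backed growth discussed in this PR. It only shows the shape of the split, with a forgiving variant that reports failure and an aborting variant built on top of it.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical miniature types; h2o's real h2o_iovec_t and h2o_buffer_t differ. */
typedef struct {
    char *base;
    size_t len;
} sketch_iovec_t;

typedef struct {
    char *bytes;
    size_t size;     /* bytes in use */
    size_t capacity; /* bytes allocated */
} sketch_buffer_t;

/* Forgiving variant: returns {NULL, 0} when the memory cannot be obtained,
 * leaving the buffer untouched. */
sketch_iovec_t sketch_buffer_try_reserve(sketch_buffer_t *buf, size_t min_guarantee)
{
    sketch_iovec_t ret = {NULL, 0};
    if (min_guarantee > (size_t)-1 - buf->size)
        return ret; /* the requested total would overflow size_t */
    size_t needed = buf->size + min_guarantee;
    if (needed == 0)
        needed = 1; /* always hand back a valid pointer on success */
    if (needed > buf->capacity) {
        char *grown = realloc(buf->bytes, needed); /* stand-in for mmap-backed growth */
        if (grown == NULL)
            return ret;
        buf->bytes = grown;
        buf->capacity = needed;
    }
    ret.base = buf->bytes + buf->size;
    ret.len = buf->capacity - buf->size;
    return ret;
}

/* Aborting variant: for call sites whose allocation size has a known upper bound. */
sketch_iovec_t sketch_buffer_reserve(sketch_buffer_t *buf, size_t min_guarantee)
{
    sketch_iovec_t ret = sketch_buffer_try_reserve(buf, min_guarantee);
    if (ret.base == NULL) {
        fprintf(stderr, "failed to reserve %zu bytes\n", min_guarantee);
        abort();
    }
    return ret;
}
```

A call site that handles unbounded input would call the try variant, test `ret.base == NULL`, and return an error to its own caller, while bounded call sites keep the aborting variant and need no check.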