Split up the buffer reservation API #2031

toru · 2019-04-10T22:30:46Z

Objective

Ensure that failures of unbounded memory allocations (e.g. buffers in the non-streaming code path) are checked and handled gracefully. For allocations of a bounded amount of memory (e.g. low-layer frame handling), abort with information about the underlying h2o_buffer_t and the requested allocation amount.

Design

Add a variant of h2o_buffer_reserve that is forgiving on mmap failure
Add a variant of h2o_buffer_append that is forgiving on mmap failure
Make h2o_buffer_(reserve|append) abort on mmap failure

Replace unbounded h2o_buffer_(reserve|append) calls with their forgiving variants, check the allocation result, and handle failures accordingly.
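
A minimal sketch of the proposed split, for illustration only. It assumes simplified stand-in types (iovec_sketch_t and buffer_sketch_t are hypothetical, not h2o's h2o_iovec_t and h2o_buffer_t) and uses realloc in place of mmap; the max_capacity field exists only so that an allocation failure can be simulated.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for h2o_iovec_t and h2o_buffer_t. */
typedef struct {
    char *base;
    size_t len;
} iovec_sketch_t;

typedef struct {
    char *bytes;
    size_t size;         /* bytes in use */
    size_t capacity;     /* bytes allocated */
    size_t max_capacity; /* assumption: artificial ceiling used to simulate failure */
} buffer_sketch_t;

/* Forgiving variant: returns {NULL, 0} when the space cannot be obtained. */
static iovec_sketch_t buffer_try_reserve(buffer_sketch_t *buf, size_t min_guarantee)
{
    iovec_sketch_t ret = {NULL, 0};

    assert(buf != NULL && min_guarantee > 0);
    assert(buf->size <= buf->capacity && buf->capacity <= buf->max_capacity);

    if (min_guarantee > buf->max_capacity - buf->size)
        return ret; /* simulated allocation failure */
    if (buf->capacity - buf->size < min_guarantee) {
        size_t new_capacity = buf->size + min_guarantee;
        char *p = realloc(buf->bytes, new_capacity);
        if (p == NULL)
            return ret; /* real allocation failure; buffer left untouched */
        buf->bytes = p;
        buf->capacity = new_capacity;
    }
    ret.base = buf->bytes + buf->size;
    ret.len = buf->capacity - buf->size;
    return ret;
}

/* Aborting variant: for call sites whose allocation size is bounded. */
static iovec_sketch_t buffer_reserve(buffer_sketch_t *buf, size_t min_guarantee)
{
    iovec_sketch_t ret = buffer_try_reserve(buf, min_guarantee);
    if (ret.base == NULL) {
        fprintf(stderr, "failed to reserve buffer; capacity: %zu, min_guarantee: %zu\n",
                buf->capacity, min_guarantee);
        abort();
    }
    return ret;
}
```

Callers on an unbounded path would check `.base == NULL` on the result of the try variant, while bounded call sites would keep calling the aborting variant without a check.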

kazuho

Thank you for working on this.

Splitting h2o_buffer_reserve into two functions is fine, but I am not sure we are changing the correct invocations to try-reserve. The function calls that we should change are the ones that allocate an unbounded amount of memory (i.e., the request body buffer used when request streaming is off, and the response body buffer used by each handler when response streaming is off).

include/h2o/memory.h

lib/http2/connection.c

kazuho · 2019-04-11T05:20:52Z

lib/http2/connection.c

Same as above. IIRC the upper limit here is 16KB (maximum HTTP/2 frame size that we accept).

Confirmed. H2O_HTTP2_SETTINGS_HOST_MAX_FRAME_SIZE is enforced when decoding the frame in h2o_http2_decode_frame(). Violation of this results in H2O_HTTP2_ERROR_FRAME_SIZE. Thanks for the exercise!

kazuho · 2019-04-11T05:21:39Z

lib/http2/connection.c

IIRC the upper size of conn->_write.buf is capped by something slightly above 64KB.

kazuho · 2019-04-11T06:26:52Z

I think we might also want to log the amount of memory we tried to allocate as well as the file and the line number of the caller, when allocation fails. That has become possible thanks to #2020.

Maybe you want to make that change in this PR? (we can do it after this PR gets merged).

toru · 2019-04-11T16:04:01Z

Thank you for the review. Logging the amount of memory sounds fantastic. I'll rebase this branch on master and take advantage of @chenbd's neat work. I'll also address your other comments as well. Until then... 👋

toru · 2019-04-11T22:15:10Z

Comments addressed. As for h2o_fatal()'ing with allocation info, I took a simple approach of printing inbuf's capacity and min_guaranteed. This approach doesn't expose the details of the failure under the hood in h2o_buffer_try_reserve(), but in practice it should be adequate to infer the scale and cause of the allocation failure.

EDIT: Here's what the new fatal output looks like:

fatal:/Users/toru/Projects/h2o/lib/common/memory.c:238:failed to reserve buffer; capacity: 8192, min_gurantee: 4096
received fatal signal 6
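
A sketch of how such a message could carry the caller's file and line along with the allocation details, as suggested in the review. The helper and macro names are hypothetical, and this is not the code in lib/common/memory.c; it only illustrates the shape of the output.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: formats a fatal message that includes the call site
 * and the allocation details. Returns what snprintf returns. */
static int format_reserve_failure(char *dst, size_t dst_len, const char *file, int line,
                                  size_t capacity, size_t min_guarantee)
{
    assert(dst != NULL && dst_len > 0 && file != NULL);
    return snprintf(dst, dst_len,
                    "fatal:%s:%d:failed to reserve buffer; capacity: %zu, min_guarantee: %zu",
                    file, line, capacity, min_guarantee);
}

/* Hypothetical convenience macro: captures the caller's file and line. */
#define FORMAT_RESERVE_FAILURE(dst, dst_len, capacity, min_guarantee) \
    format_reserve_failure((dst), (dst_len), __FILE__, __LINE__, (capacity), (min_guarantee))
```

Because the macro expands at the call site, the reported location is that of the caller rather than of the helper.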

kazuho

Thank you for the changes. I think we are making good progress, but also have some more work to do.

In my previous review, I stated, quote: the function calls that we should change are the ones that allocate an unbounded amount of memory (i.e., the request body buffer used when request streaming is off, and the response body buffer used by each handler when response streaming is off).

What I was trying to suggest is that you need to check the invocations of h2o_buffer_reserve across the entire source tree (not just those in HTTP/2), because changing what the function does changes the behavior of every call site. The quoted criterion was meant to explain how you might determine what to do for each of those invocations.

I understand that it's going to be tough, but it's something we need to do in order to land this PR.

I'll rebase this branch on master

Generally speaking, I'd appreciate it if you could push additional commits instead, because a rebase destroys the context of PR reviews and causes the local repository to diverge from the h2o repository for anyone who has pulled the PR from the refs/pull/<no> branch.

kazuho · 2019-04-11T23:20:58Z

lib/http2/connection.c


    { /* send SETTINGS and connection-level WINDOW_UPDATE */
-        h2o_iovec_t vec = h2o_buffer_reserve(&conn->_write.buf, SERVER_PREFACE.len);
+        h2o_iovec_t vec = h2o_buffer_try_reserve(&conn->_write.buf, SERVER_PREFACE.len);


I think you missed this.

kazuho · 2019-04-11T23:21:23Z

lib/http2/connection.c

    if (conn->_http1_req_input->size > reqsize) {
        size_t remaining_bytes = conn->_http1_req_input->size - reqsize;
-        h2o_buffer_reserve(&sock->input, remaining_bytes);
+        if ((h2o_buffer_try_reserve(&sock->input, remaining_bytes)).base == NULL) {


I would appreciate it if you could determine if we need to make this change.

Gladly. My take is that the upper-bound on H2 upgrade is protected by limit-request-body. In fact, we won't even make it to h2o_http2_handle_upgrade() if the entity size exceeds the limit, thanks to the entity size validation earlier in the codepath (specifically around entity_readers).

Here's a quick illustration with an H2O proxy that has a small limit-request-body setting:

> POST / HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.54.0
> Accept: */*
> Connection: Upgrade, HTTP2-Settings
> Upgrade: h2c
> HTTP2-Settings: AAMAAABkAARAAAAAAAIAAAAA
> Content-Length: 1175
> Content-Type: application/x-www-form-urlencoded
> Expect: 100-continue
>
< HTTP/1.1 413 Request Entity Too Large
< Connection: close
< Content-Length: 27
< Server: h2o/2.3.0-DEV@e9799f71
< content-type: text/plain; charset=utf-8

My conclusion is: No, we don't need this change. I'll fix it along with the one I missed earlier. Thanks!

Done: b275c7a -- I'm now moving on to tracking down unbounded allocation requests throughout the source tree.

toru · 2019-04-12T01:11:00Z

Thanks for the follow up! I see your point about going through the entire tree. I admit it's beyond what I initially anticipated but I agree that it must be done. Exactly this 😅:

I understand that it's going to be tough, but it's something we need to do in order to land this PR.

Also ack regarding your preference to keep stacking commits.

h2o_buffer_try_reserve should only be used for allocations of an unbounded amount of memory. None of the buffer allocations in the H2 connection so far fall into that category.

toru · 2019-04-15T22:24:03Z

For posterity, lib/common/http2client.c can remain as it is, because every invocation there is bounded:

expect_continuation_of_headers (http2client.c:370) is bounded to at most 16384 bytes (H2O_HTTP2_SETTINGS_CLIENT_MAX_FRAME_SIZE)
handle_headers_frame (http2client.c:498) is bounded to at most 16384 bytes (H2O_HTTP2_SETTINGS_CLIENT_MAX_FRAME_SIZE)
handle_settings_frame (http2client.c:604) allocates exactly 9 bytes (H2O_HTTP2_FRAME_HEADER_SIZE)
stream_emit_pending_data (http2client.c:1068) is bounded to at most H2O_HTTP2_FRAME_HEADER_SIZE + max_payload_size, where max_payload_size is capped at 16777215 by h2o_http2_update_peer_settings
send_client_preface (http2client.c:1213) is bounded to sizeof(PREFIX) - 1 + 4
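
As a worked example of the largest of these bounds: an HTTP/2 frame header is 9 bytes and the protocol caps SETTINGS_MAX_FRAME_SIZE at 2^24 - 1 = 16777215, so the worst-case reservation for one DATA frame is 16777224 bytes. The constant and function names below are illustrative stand-ins, not the h2o macros.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative constants; the values come from the HTTP/2 specification. */
enum {
    FRAME_HEADER_SIZE = 9,            /* fixed size of an HTTP/2 frame header */
    DEFAULT_MAX_FRAME_SIZE = 16384,   /* initial SETTINGS_MAX_FRAME_SIZE */
    LARGEST_MAX_FRAME_SIZE = 16777215 /* 2^24 - 1, the largest value a peer may advertise */
};

/* Upper bound on the bytes reserved to emit one frame with the given payload cap. */
static size_t max_frame_reservation(size_t max_payload_size)
{
    assert(max_payload_size <= LARGEST_MAX_FRAME_SIZE);
    return (size_t)FRAME_HEADER_SIZE + max_payload_size;
}
```

With the default frame size the bound is 16393 bytes; with the largest permitted setting it is 16777224 bytes, which is large but still finite.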

toru · 2019-04-18T05:45:56Z

@kazuho I was able to verify that it's possible to grow the H2 request buffer against handlers that do not support streaming in write_req_non_streaming. I'd really appreciate your feedback on e33175c before proceeding any further. In summary:

split h2o_buffer_append like we did with h2o_buffer_reserve
h2o_buffer_append aborts on allocation failure, whereas h2o_buffer_try_append will provide an opportunity to handle the "unbounded" allocation failure
use the new h2o_buffer_try_append in write_req_non_streaming

If the second point is valid, then it would be nice to change the API of h2o_buffer_append to be a void function, but at this time I decided not to touch the interface. This topic IMHO deserves a separate discussion because an API change shouldn't be taken lightly.
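
A rough sketch of the idea behind using a try-append on the non-streaming request body. All names here (body_buffer_t, body_try_append, write_req_chunk) are hypothetical, the return convention (1 on success, 0 on failure) is an assumption, and the limit field only simulates an allocation failure; what the real caller does on failure is not shown in this thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical request-body buffer; `limit` simulates the point at which
 * the allocator would refuse to grow the buffer. */
typedef struct {
    char *bytes;
    size_t size;
    size_t capacity;
    size_t limit;
} body_buffer_t;

/* Returns 1 on success and 0 on allocation failure (assumed convention). */
static int body_try_append(body_buffer_t *buf, const void *src, size_t len)
{
    assert(buf != NULL && (src != NULL || len == 0));
    assert(buf->size <= buf->capacity && buf->capacity <= buf->limit);

    if (len > buf->limit - buf->size)
        return 0; /* simulated allocation failure */
    if (buf->capacity - buf->size < len) {
        size_t new_capacity = buf->size + len;
        char *p = realloc(buf->bytes, new_capacity);
        if (p == NULL)
            return 0; /* buffer contents stay intact */
        buf->bytes = p;
        buf->capacity = new_capacity;
    }
    if (len != 0)
        memcpy(buf->bytes + buf->size, src, len);
    buf->size += len;
    return 1;
}

/* Hypothetical caller: reports -1 so the connection layer can fail the
 * request instead of aborting the whole process. */
static int write_req_chunk(body_buffer_t *body, const void *chunk, size_t len)
{
    if (!body_try_append(body, chunk, len))
        return -1;
    return 0;
}
```

The point of the split is visible in write_req_chunk: a client that sends an arbitrarily large body can only make its own request fail.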

Basically the same as e33175c, but for the HTTP/1 frontend. The h2o_buffer_append in the http2 code has been reverted, because the error check makes sense given the existing API.

The fcgi response (up to 64KB) is accumulated into the response buffer using h2o_buffer_reserve. Use h2o_buffer_try_reserve instead. The other invocations of h2o_buffer_reserve can remain as they are, because the upper bound of the allocation size is known to be 64KB.

While this change might be too defensive, I was able to trigger an mmap call by preparing a large directory on the APFS file system, which is comparable to other file systems in this particular context. I admit that my test case is unorthodox but at the same time, I can't say that it's impossible for an innocent user to have a big directory.

kazuho · 2019-04-19T00:43:20Z

@toru I think your observation is correct.

Regarding the function names and the return values, the high order agreement is that we should split the buffer functions into one that dies on allocation failure and one that returns an error.

And I prefer renaming the functions that return allocation failures to include "try" in their names, because it is a fact that we've sometimes forgotten to check the return values of the functions. Having "try" would help us avoid the problem. The flip side of that is that changing the semantics (including the return type) of h2o_buffer_reserve and h2o_buffer_append is fine.

WDYT?

toru · 2019-04-19T03:36:54Z

Thank you for the follow up! I agree with your view that having the word try_ in the function name would effectively prompt the developer to check the return value. I think this is a powerful fundamental benefit for the project. It'll be worth the semantic changes. This credit goes to @deweerdt, not me :)

This buffer grows proportionally to the number of requests that are in-flight, therefore the max buffer size is unbounded. As a result of this change, h1/h2's foreach_request function can fail early hence the error check.

h2o_buffer_try_reserve does not abort on mmap failure, therefore check the return value and bubble up the error.
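
To illustrate what failing early and bubbling up the error can look like, here is a small sketch of a callback-driven iteration that stops at the first failure. The signatures are simplified and hypothetical; they are not the actual foreach_request interface of the h1/h2 frontends.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: returns 0 on success, non-zero on failure. */
typedef int (*request_cb)(int request_id, void *ctx);

/* Visits each request and stops at the first callback failure,
 * returning that failure to the caller. */
static int foreach_request(const int *request_ids, size_t count, request_cb cb, void *ctx)
{
    assert(cb != NULL && (request_ids != NULL || count == 0));
    for (size_t i = 0; i != count; ++i) {
        int ret = cb(request_ids[i], ctx);
        if (ret != 0)
            return ret; /* bubble the error up instead of aborting */
    }
    return 0;
}

/* Hypothetical collector whose "buffer" can hold only `budget` entries,
 * standing in for a try-reserve that fails once memory runs out. */
struct collector {
    size_t used;
    size_t budget;
};

static int collect(int request_id, void *ctx)
{
    struct collector *c = ctx;
    (void)request_id;
    if (c->used == c->budget)
        return -1; /* simulated reservation failure */
    ++c->used;
    return 0;
}
```

The caller of foreach_request then decides how to report the failure, for example by returning an error response from the status handler.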

toru · 2019-04-22T05:07:03Z

All h2o_buffer_reserve calls in http2client appear to have an upper bound, and because the calls are for essential protocol processing, I feel OK leaving them as they are.

This change ensures that the behavior on buffer allocation failure remains the same at the socket layer. Even though the allocation amount is fixed at 4096 bytes inside the event loop backends, the calls to h2o_buffer_reserve were changed to h2o_buffer_try_reserve so that the event library can decide what to do on allocation failure.

The buffer allocation in send_chunk_method() needs to be replaced with h2o_buffer_try_reserve, because the amount of memory allocated grows proportionally to the size of the output. On the other hand, the buffer allocation in post_error() can remain as-is, because the allocation size is determined by short constant strings.

toru · 2019-04-25T07:52:50Z

@kazuho I think this PR is ready for another glance. I'm fairly confident that I was able to address most (if not all) of the unbounded allocations. That said, there is a chance that I might have missed a case due to the sparse nature of the changeset.

kazuho · 2019-06-12T02:59:25Z

Thank you for your patience. Merged to master. I made some changes, please see the commits.

kazuho reviewed Apr 11, 2019

toru added 2 commits April 11, 2019 11:40

memory: add an unforgiving way to reserve a buffer

daeaf90

http2: handle memory allocation failure on the frontend

9026448

toru force-pushed the buffer-reserve branch from 382b4ce to 9026448 on April 11, 2019 20:02

memory: write allocation info when aborting

e9799f7

kazuho reviewed Apr 11, 2019

http2: continue using h2o_buffer_reserve

b275c7a

http2: use h2o_buffer_try_append on non-streaming req buffer

e33175c

toru added 4 commits April 18, 2019 00:38

Merge branch 'master' into buffer-reserve

a616fda

http1: use h2o_buffer_append when non-streaming

94605b2

toru force-pushed the buffer-reserve branch from e28e5f7 to 2d24c52 on April 18, 2019 23:23

toru added 2 commits April 21, 2019 00:15

status: use h2o_buffer_try_reserve for req buffer

7c3032d

fcgi: handle memory allocation failure in append_content

0505c88

toru added 2 commits April 22, 2019 00:11

toru changed the title ~~Tweak the buffer reservation API for defensive memory handling~~ Split up the buffer reservation API Apr 25, 2019

kazuho merged commit 083f945 into h2o:master Jun 12, 2019

kazuho added a commit that referenced this pull request Jun 12, 2019

Merge pull request #2031

9a8e07e

Split up the buffer reservation API
