@deweerdt
Member

This commit introduces the infrastructure needed to stream request bodies
for http1: the two readers, chunked and content-length, use
h2o_req_t::write_req.cb to emit body chunks.

@deweerdt deweerdt changed the title Http1 streaming request bodies [http1] streaming request bodies Mar 15, 2019
Member

@kazuho kazuho left a comment

Thank you for the PR. I haven't gone through the code, but I noticed the following points that you might want to look into.

Regarding the test failure, I am not sure what's happening. IIUC the test that fails now does not involve a request body, and therefore should continue to work fine, assuming that the behavior of the http1 handler has not changed for requests without bodies. Am I missing something here?

@deweerdt
Member Author

Thank you @i110 and @kazuho. I didn't realize the test failure was coming from the non-streaming case. Addressed. I've also addressed the issue pointed out by @kazuho above. I believe this can be cleaned up by merging some of the streaming infrastructure used by http1 and http2. I'll take a stab at that.

Member

@kazuho kazuho left a comment

Thank you for the PR.

The high-level design looks fine, and I like how you've refactored the code so that some of the logic is shared between the http1 stack and the http2 stack. I think we might want to make further tweaks (including @i110's point that the selection between streaming and non-streaming mode does not need to be delayed), but I am fine with doing it in a separate PR if that's preferable for you.

Regarding the changes to the HTTP/1 stack, I think I have found some corner cases (see below). They make me wonder what we should do about the lack of tests covering the error cases and pipelining. Would you be interested in writing them, or do you want somebody more familiar with Perl to work on the issue (thinking of @i110 or myself)?

lib/http1.c Outdated
/* all input has arrived */
conn->req.entity = h2o_iovec_init(conn->sock->input->bytes + conn->_reqsize - reader->content_length, reader->content_length);
on_entity_read_complete(conn);
handle_one_body_fragment(conn, conn->sock->input->size, complete);
Member

Doesn't this process an excessive amount of input as the request body if the last input consists of the end of the request body and the first few bytes of the next request (i.e. pipelining)?

Please correct me if I'm wrong, and let me know if we have a test covering such a case.
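
The pipelining concern can be illustrated with a minimal sketch. It assumes a hypothetical `cl_reader` struct and a hypothetical helper `body_bytes_in_buffer` (neither is h2o's actual code): the reader takes at most the bytes still owed to the current body and leaves the rest in the buffer for the next request.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a content-length entity reader; not h2o's struct. */
struct cl_reader {
    size_t content_length; /* total body size announced by the request */
    size_t bytes_received; /* body bytes already handed to the handler */
};

/* Returns how many of the `available` buffered bytes belong to the current
 * request body. Any bytes beyond that are the start of the next pipelined
 * request and must stay in the buffer. */
static size_t body_bytes_in_buffer(const struct cl_reader *r, size_t available, int *is_complete)
{
    assert(r->bytes_received <= r->content_length);
    size_t remaining = r->content_length - r->bytes_received;
    size_t take = available < remaining ? available : remaining;
    *is_complete = take == remaining;
    return take;
}
```

With a 10-byte body of which 6 bytes have arrived, a buffer holding 20 bytes yields only 4 body bytes; the other 16 belong to whatever follows on the connection.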

Member Author

That's right. I've added a test in fb8e95b that shows the previous code was broken. This is fixed now.

lib/http1.c Outdated
return;
}

if (conn->req.proceed_req) {
Member

I'd appreciate it if you could add != NULL. We apply operators so that the result would be a boolean, unless the input (being an integral value) is considered a boolean (due to lack of a built-in boolean type in C).
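
As a small illustration of this style rule, the sketch below uses a hypothetical `req_sketch` struct and a hypothetical `can_reuse_connection` helper (not part of h2o): the pointer is compared against NULL explicitly, while the `int` flag that is already treated as a boolean is tested directly.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*proceed_cb)(void); /* hypothetical callback type */

/* Hypothetical request state, for illustration only. */
struct req_sketch {
    proceed_cb proceed_req; /* pointer: compare against NULL explicitly */
    int is_persistent;      /* integral value used as a boolean: test directly */
};

static void noop_proceed(void)
{
}

/* Returns 1 when no body streaming is pending and keep-alive is allowed. */
static int can_reuse_connection(const struct req_sketch *req)
{
    assert(req != NULL);
    if (req->proceed_req != NULL) /* operator applied so the result is a boolean */
        return 0;
    if (!req->is_persistent) /* already a boolean by convention */
        return 0;
    return 1;
}
```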

Member Author

Addressed in c3bcd0d, thank you.

lib/http1.c Outdated
if (conn->req.proceed_req) {
conn->_req_entity_reader = NULL;
set_timeout(conn, 0, NULL);
h2o_socket_read_stop(conn->sock);
Member

I'm not sure if this is the right moment. Don't we need to disarm the timeout and stop reading additional bytes every time when we pass something to the handler? Otherwise, the amount of data we buffer cannot be capped.
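
The backpressure idea in this comment can be sketched as a tiny state model. Everything below is hypothetical (`conn_sketch`, `pass_to_handler`, `on_proceed` are illustration-only names); in the real code the corresponding actions would be the `set_timeout(conn, 0, NULL)` and `h2o_socket_read_stop(conn->sock)` calls shown in the snippet.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical connection state, reduced to what matters for backpressure. */
struct conn_sketch {
    int reading;       /* 1 while socket reads are armed */
    int timeout_armed; /* 1 while the request I/O timeout is armed */
    size_t inflight;   /* bytes handed to the handler, not yet acknowledged */
};

/* Every time a fragment goes to the handler, stop reading and disarm the
 * timeout, so buffered data is capped at what has already been read. */
static void pass_to_handler(struct conn_sketch *c, size_t len)
{
    assert(c->inflight == 0); /* one fragment in flight at a time */
    c->inflight = len;
    c->timeout_armed = 0;
    c->reading = 0;
}

/* When the handler signals it can take more, resume reading unless the
 * body has ended. */
static void on_proceed(struct conn_sketch *c, int is_end_stream)
{
    c->inflight = 0;
    if (!is_end_stream) {
        c->reading = 1;
        c->timeout_armed = 1;
    }
}
```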

Member Author

Addressed in e4d894b

Member

e4d894b looks good to me, but would you mind elaborating why we need to stop reading (or consider the case of conn->req.proceed_req being non-NULL) in this function, after consulting the value of conn->req.http1_is_persistent?

It is my understanding that cleanup_connection is called only when the HTTP/1 stack successfully processes a request and sends a response. Assuming that is still the case, I think proceed_req and _req_entity_reader must have been set to NULL, and reading from the socket should have been stopped. Maybe what we need are assertions.
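
A minimal sketch of what such assertions could look like, using a hypothetical `conn_state` struct and `cleanup_connection_sketch` function rather than h2o's real types:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduction of the connection state relevant to cleanup. */
struct conn_state {
    void *proceed_req;        /* non-NULL while a streamed body is pending */
    void *req_entity_reader;  /* non-NULL while a body reader is installed */
    int is_reading;           /* 1 while socket reads are armed */
    unsigned requests_served; /* illustration-only counter */
};

/* Instead of silently resetting state, assert the invariants that should
 * already hold once a request has completed. */
static void cleanup_connection_sketch(struct conn_state *c)
{
    assert(c->proceed_req == NULL);
    assert(c->req_entity_reader == NULL);
    assert(!c->is_reading);
    ++c->requests_served;
}
```

The benefit of asserting over resetting is that a code path which completes the request without tearing down the reader fails loudly in testing instead of being papered over.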

Member Author

I've added assertions, thanks for the suggestion.

And the assertions found that 603c731 is necessary because h2o_send_error can complete the request without closing the connection.

…-encoding:chunked

While the behavior is technically correct, dropping content-length:
loses information that might be valuable to other peers down the path.

We fix this by making sure that `req.content_length` is correctly
assigned for http/1 when available. It was already correctly initialized
in the http/2 case.
connection to the origin is established, effectively disabling streaming
@deweerdt deweerdt force-pushed the http1-streaming-request-bodies branch from becf577 to fbec251 Compare April 9, 2019 16:33
@deweerdt deweerdt force-pushed the http1-streaming-request-bodies branch from fbec251 to fa4fa74 Compare April 9, 2019 16:38
@deweerdt
Member Author

deweerdt commented Apr 9, 2019

@kazuho I believe fa4fa74 addresses all your comments so far. I also have a test branch that integrates with @i110's #2010, which looks good.

Member

@kazuho kazuho left a comment

Thank you for the changes. Looks mostly fine to me, some nitpicks below. PTAL.

include/h2o.h Outdated

typedef void (*h2o_proceed_req_cb)(h2o_req_t *req, size_t written, int is_end_stream);
typedef int (*h2o_write_req_cb)(void *ctx, h2o_iovec_t chunk, int is_end_stream);
typedef void (*h2o_on_body_streaming_selected_cb)(h2o_req_t *, int streaming);
Member

I might suggest changing the typename and the prototype argument to something like:

typedef void (*h2o_on_request_streaming_selected_cb)(h2o_req_t *req, int is_streaming);
  • By changing "body_streaming" to "request_streaming", we avoid the confusion that this is related to response streaming. We can omit "body", because body is the only thing we stream.
  • Adding "is_" prefix better aligns the callback type with other functions that also accept boolean arguments (see right above).

Note also that support for the feature is indicated by a property named supports_request_streaming in h2o_handler_t.

include/h2o.h Outdated
struct {
h2o_write_req_cb cb;
void *ctx;
h2o_on_body_streaming_selected_cb on_body_streaming_selected;
Member

The name of the attribute can be write_req.on_streaming_selected instead of write_req.on_body_streaming_selected if we follow the rules stated above.

include/h2o.h Outdated
struct {
size_t bytes_received;
h2o_buffer_t *body;
} _body;
Member

s/_body/_req_body/

lib/http1.c Outdated
entity_read_send_error_502(conn, "Bad Gateway", "Bad Gateway");
return;
}
h2o_buffer_consume(&conn->sock->input, consume);
Member

I think that we can omit the third argument of the function and do h2o_buffer_consume(&conn->sock->input, fragment_size) here.

The only case where fragment_size and consume are different in the current code is when chunked encoding is used. I'd argue that for such a case, the caller can invoke h2o_buffer_consume at first to consume the chunk header, then invoke handle_one_body_fragment to just process the payload of the chunk.

h2o_buffer_consume is a fast function, and processing of a chunked request is not a cold path (because only clients that know that the server supports HTTP/1.1 would use it).
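
The suggested split can be sketched as follows. The names `input_buf`, `buf_consume`, `handle_fragment_sketch`, and `process_one_chunk` are hypothetical stand-ins, not h2o functions; the point is only that the caller consumes the chunk header first, so the fragment handler needs just one size argument.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the socket input buffer. */
struct input_buf {
    const char *bytes;
    size_t size;
};

/* Drops the first n bytes of the buffer. */
static void buf_consume(struct input_buf *b, size_t n)
{
    assert(n <= b->size);
    b->bytes += n;
    b->size -= n;
}

/* Handles exactly fragment_size payload bytes and consumes exactly that
 * many, so no separate "consume" argument is needed. */
static size_t handle_fragment_sketch(struct input_buf *input, size_t fragment_size)
{
    /* a real handler would pass the first fragment_size bytes on here */
    buf_consume(input, fragment_size);
    return fragment_size;
}

/* Chunked case: the caller strips the chunk header first, then hands only
 * the payload to the fragment handler. */
static size_t process_one_chunk(struct input_buf *input, size_t header_len, size_t payload_len)
{
    buf_consume(input, header_len);
    return handle_fragment_sketch(input, payload_len);
}
```

For the input `5\r\nhello\r\n`, a 3-byte header and 5-byte payload leave only the trailing CRLF in the buffer.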

lib/http1.c Outdated
set_timeout(conn, 0, NULL);
h2o_socket_read_stop(conn->sock);
process_request(conn);
}
Member

I think we should inline-expand the body of this function in handle_one_body_fragment (that's the only call-site), preferably before we call write_req.

lib/http1.c Outdated
return 0;
}

static void on_body_streaming_selected(h2o_req_t *req, int streaming)
Member

s/streaming/is_streaming/

}

static int write_req_first(void *_req, h2o_iovec_t payload, int is_end_stream)
static void on_body_streaming_selected(h2o_req_t *req, int streaming)
Member

s/on_body_streaming_selected/on_request_streaming_selected/
s/int streaming/int is_streaming/

lib/http1.c Outdated
uint64_t _req_index;
size_t _prevreqlen;
size_t _reqsize;
size_t _headers_size;
Member

I would appreciate it if you could either change the name of the variable to something more appropriate, or add a comment explaining how the value is used.

This is because, with the proposed change, the value does not always represent the size of the header fields (when the request has a body, it becomes zero). It is my understanding that it represents the amount of data left unconsumed in the socket read buffer.

Member Author

I've gone with _unconsumed_request_size

- s/_body/_req_body/
- s/on_body_streaming_selected/on_streaming_selected/
- s/h2o_on_body_streaming_selected_cb/h2o_on_request_streaming_selected_cb/
@deweerdt deweerdt force-pushed the http1-streaming-request-bodies branch from 603c731 to 765027f Compare April 13, 2019 01:58
@deweerdt
Member Author

@kazuho I believe that 765027f addresses all your comments so far. A change that wasn't covered in the previous exchange is the connection close on the proxy 502s. Any thoughts on that?

@kazuho
Member

kazuho commented Apr 15, 2019

A change that wasn't covered in the previous exchange is the connection close on the proxy 502s. Any thoughts on that?

IIUC, the proposed change is to suggest closing the connection from proxy.c whenever it is impossible to connect to the origin. I am not sure if that is the correct thing to do.

At the moment, we close an H1 connection only when the framing becomes corrupt (i.e. when there would be a risk of a splitting attack unless we close the connection). In case of an origin sending 502, the framing is not necessarily corrupt. It is beneficial to keep the connection open, especially when only some of the requests are routed to an origin while other requests are served by H2O itself.

I think we might want to consider the case you are trying to fix as part of #2010. The proxy handler returning 502 because it cannot connect to an origin while the request from the client is still in flight is a particular form of returning an early response. I am not sure how we should handle the error, but my instinct is that it's not unique to the proxy handler, but rather general to all the handlers.

@kazuho
Member

kazuho commented Apr 15, 2019

@deweerdt Regarding my previous comment, I asked @i110 how he deals with the issue in H2, and his answer was that in #2010 the stream is closed by the proceed_req callback:

stream_send_error(conn, stream->stream_id, H2O_HTTP2_ERROR_NONE);

I think this approach might be something we should adopt in H1 as well (or change to a different approach in both H1 and H2 code). WDYT?

@kazuho kazuho mentioned this pull request Apr 15, 2019
10 tasks
@deweerdt
Member Author

I think this approach might be something we should adopt in H1 as well (or change to a different approach in both H1 and H2 code). WDYT?

That would work for me. Would you prefer a combined PR with #2007 and #2010 ?

@kazuho
Member

kazuho commented Apr 15, 2019

That would work for me. Would you prefer a combined PR with #2007 and #2010 ?

That's a good question. Yeah, let's tackle the issues independently. I think we can land the code here (that uses H2O_SEND_ERROR_HTTP1_CLOSE_CONNECTION) as-is, and fix it later in #2010 or in a follow-up PR of #2010.

Using H2O_SEND_ERROR_HTTP1_CLOSE_CONNECTION has efficiency issues, but works perfectly as a short-term solution.

@deweerdt
Member Author

That's a good question. Yeah, let's tackle the issues independently. I think we can land the code here (that uses H2O_SEND_ERROR_HTTP1_CLOSE_CONNECTION) as-is, and fix it later in #2010 or in a follow-up PR of #2010.

That works. There's code more or less ready anyway, since I've tested an earlier version of this PR+2010 to help review #2010.

@kazuho kazuho merged commit c14554e into h2o:master Apr 16, 2019
@kazuho
Member

kazuho commented Apr 16, 2019

Thank you for working on this complex PR. Merged to master.
