From 3c37a5e8a3df302805f7e46a24e93277056b47a2 Mon Sep 17 00:00:00 2001 From: Ashwin Menon Date: Thu, 1 May 2025 14:34:13 -0700 Subject: [PATCH 1/2] Add 'Expect: 100-Continue' support to httplib Previously, http.client would always send content body immediately and ignore any 100 responses. This change makes HTTPClient.request() wait for a `Continue` response if the `Expect: 100-Continue` header is set, and adds a parameter to HTTPClient.getresponse() that will cause it to return 100 responses instead of eating them. --- Doc/library/http.client.rst | 84 ++++++++++- Lib/http/client.py | 89 +++++++++--- Lib/test/test_httplib.py | 137 ++++++++++++++++++ Misc/ACKS | 1 + ...2019-07-22-00-57-26.bpo-1346874.TentEE.rst | 2 + 5 files changed, 287 insertions(+), 26 deletions(-) create mode 100644 Misc/NEWS.d/next/Library/2019-07-22-00-57-26.bpo-1346874.TentEE.rst diff --git a/Doc/library/http.client.rst b/Doc/library/http.client.rst index 2835c8d0eb711e..af84633bbb0aff 100644 --- a/Doc/library/http.client.rst +++ b/Doc/library/http.client.rst @@ -34,7 +34,7 @@ The module provides the following classes: .. class:: HTTPConnection(host, port=None[, timeout], source_address=None, \ - blocksize=8192) + blocksize=8192, continue_timeout=2.5) An :class:`HTTPConnection` instance represents one transaction with an HTTP server. It should be instantiated by passing it a host and optional port @@ -46,7 +46,10 @@ The module provides the following classes: The optional *source_address* parameter may be a tuple of a (host, port) to use as the source address the HTTP connection is made from. The optional *blocksize* parameter sets the buffer size in bytes for - sending a file-like message body. + sending a file-like message body. Finally, the optional *continue_timeout* + parameter controls how long the connection will wait for a ``100 Continue`` + response from the server before sending the body, if the request included + an ``Expect: 100-Continue`` header. For example, the following calls all create instances that connect to the server at the same host and port:: @@ -66,6 +69,9 @@ The module provides the following classes: .. versionchanged:: 3.7 *blocksize* parameter was added. + .. versionchanged:: 3.14 + *continue_timeout* parameter was added. + .. class:: HTTPSConnection(host, port=None, *[, timeout], \ source_address=None, context=None, \ @@ -321,11 +327,24 @@ HTTPConnection Objects No attempt is made to determine the Content-Length for file objects. -.. method:: HTTPConnection.getresponse() + .. versionchanged:: 3.14 + If the headers include ``Expect: 100-Continue`` and *body* is set, + the body will not be sent until either a ``100 Continue`` response is + received from the server or the :class:`HTTPConnection`'s *continue_timeout* + is reached; if a response code other than 100 is received, the body + will not be sent at all. + +.. method:: HTTPConnection.getresponse(ignore_100_continue=True) Should be called after a request is sent to get the response from the server. Returns an :class:`HTTPResponse` instance. + By default, a server response of ``100 Continue`` will be ignored and the + call will not return until a response other than code 100 is received. + Setting *ignore_100_continue* to ``False`` will allow a 100 response to be + returned; you will then need to call :meth:`getresponse` a second time + (after transmitting the body) to get the final response. + .. note:: Note that you must have read the whole response before you can send a new @@ -336,6 +355,10 @@ HTTPConnection Objects :class:`HTTPConnection` object will be ready to reconnect when a new request is sent. + .. versionchanged:: 3.14 + Added the *ignore_100_continue* parameter. (In prior versions + a ``Continue`` response was always ignored.) + .. method:: HTTPConnection.set_debuglevel(level) @@ -439,11 +462,17 @@ also send your request step by step, by using the four functions below. an argument. -.. method:: HTTPConnection.endheaders(message_body=None, *, encode_chunked=False) +.. method:: HTTPConnection.endheaders(message_body=None, *, encode_chunked=False, \ + expect_continue=False) Send a blank line to the server, signalling the end of the headers. The optional *message_body* argument can be used to pass a message body - associated with the request. + associated with the request. If a body is provided, setting + *expect_continue* to ``True`` will wait for a ``100 Continue`` response + from the server before sending the body. (This should generally be + used only when an ``Expect: 100-Continue`` header has been sent.) If no + response is received within the :class:`HTTPConnection`'s *continue_timeout*, + the body will be sent regardless. If *encode_chunked* is ``True``, the result of each iteration of *message_body* will be chunk-encoded as specified in :rfc:`7230`, @@ -464,6 +493,9 @@ also send your request step by step, by using the four functions below. .. versionchanged:: 3.6 Added chunked encoding support and the *encode_chunked* parameter. + .. versionadded:: 3.14 + The *expect_continue* parameter was added. + .. method:: HTTPConnection.send(data) @@ -642,6 +674,48 @@ method attribute. Here is an example session that uses the ``PUT`` method:: >>> print(response.status, response.reason) 200, OK +Since version 3.14, conditional transmission of the body is supported when an +``Expect: 100-Continue`` header is set. To use this in a simple case, just +set the header, and optionally the time for which the client should wait for +a ``100 Continue`` response before sending the body regardless:: + + >>> import http.client + >>> BODY = "***filecontents***" + >>> conn = http.client.HTTPConnection("localhost", 8080, continue_timeout=1.0) + >>> conn.request("PUT", "/file", BODY, headers={'Expect': '100-Continue'}) + >>> response = conn.getresponse() + >>> # You will not see the '100' response, as it is handled internally + >>> print(response.status, response.reason) + 200, OK + +Here is a more complex example in which we manually check the response and +decide whether to send the body. This may be useful if the body must be +generated by some resource-intensive process which should be skipped if the +server will not accept it. :: + + >>> import http.client + >>> conn = http.client.HTTPConnection("localhost", 8080) + >>> conn.putrequest("PUT", "/file") + >>> conn.putheader('Expect', '100-Continue') + >>> # Assuming you know in advance what the length will be + >>> # If not, you will need to encode it as chunked + >>> conn.putheader('Content-Length', '42') + >>> conn.endheaders() + >>> response = conn.getresponse(ignore_100_continue=False) + >>> print(response.status, response.reason) + 100, Continue + >>> BODY = resource_intensive_calculation() + >>> conn.send(BODY) + >>> response = conn.getresponse() + >>> print(response.status, response.reason) + 200, OK + +.. note:: + + The *continue_timeout* setting does not apply when directly using + :meth:`HTTPConnection.getresponse`, so use the above example only if you are confident + that the server respects the ``Expect: 100-Continue`` header. + .. _httpmessage-objects: HTTPMessage Objects diff --git a/Lib/http/client.py b/Lib/http/client.py index 33a858d34ae1ba..eae678b4326466 100644 --- a/Lib/http/client.py +++ b/Lib/http/client.py @@ -74,6 +74,7 @@ import http import io import re +import select import socket import sys import collections.abc @@ -321,15 +322,16 @@ def _read_status(self): raise BadStatusLine(line) return version, status, reason - def begin(self): + def begin(self, ignore_100_continue=True): if self.headers is not None: # we've already started reading the response return # read until we get a non-100 response + # (unless caller has requested return of 100 responses) while True: version, status, reason = self._read_status() - if status != CONTINUE: + if not (ignore_100_continue and status == CONTINUE): break # skip the header from the 100 response skipped_headers = _read_headers(self.fp) @@ -865,13 +867,15 @@ def _get_content_length(body, method): return None def __init__(self, host, port=None, timeout=socket._GLOBAL_DEFAULT_TIMEOUT, - source_address=None, blocksize=8192): + source_address=None, blocksize=8192, continue_timeout=2.5): self.timeout = timeout + self.continue_timeout = continue_timeout self.source_address = source_address self.blocksize = blocksize self.sock = None self._buffer = [] self.__response = None + self.__pending_response = None self.__state = _CS_IDLE self._method = None self._tunnel_host = None @@ -883,9 +887,10 @@ def __init__(self, host, port=None, timeout=socket._GLOBAL_DEFAULT_TIMEOUT, self._validate_host(self.host) - # This is stored as an instance variable to allow unit - # tests to replace it with a suitable mockup + # These are stored as instance variables to allow unit + # tests to replace them with a suitable mockup self._create_connection = socket.create_connection + self._select = select.select def set_tunnel(self, host, port=None, headers=None): """Set up host and port for HTTP CONNECT tunnelling. @@ -1021,6 +1026,10 @@ def close(self): self.sock = None sock.close() # close it manually... there may be other refs finally: + pending_response = self.__pending_response + if pending_response: + self.__pending_response = None + pending_response.close() response = self.__response if response: self.__response = None @@ -1081,7 +1090,8 @@ def _read_readable(self, readable): datablock = datablock.encode("iso-8859-1") yield datablock - def _send_output(self, message_body=None, encode_chunked=False): + def _send_output(self, message_body=None, encode_chunked=False, + expect_continue=False): """Send the currently buffered request and clear the buffer. Appends an extra \\r\\n to the buffer. @@ -1093,6 +1103,23 @@ def _send_output(self, message_body=None, encode_chunked=False): self.send(msg) if message_body is not None: + if expect_continue and not self.__response: + read_ready, _, _ = self._select([self.sock], [], [], + self.continue_timeout) + if read_ready: + if self.debuglevel > 0: + response = self.response_class(self.sock, + self.debuglevel, + method=self._method) + else: + response = self.response_class(self.sock, + method=self._method) + response.begin(ignore_100_continue=False) + if response.code != CONTINUE: + # Break without sending the body + self.__pending_response = response + return + response.close() # create a consistent interface to message_body if hasattr(message_body, 'read'): @@ -1319,7 +1346,8 @@ def putheader(self, header, *values): header = header + b': ' + value self._output(header) - def endheaders(self, message_body=None, *, encode_chunked=False): + def endheaders(self, message_body=None, *, encode_chunked=False, + expect_continue=False): """Indicate that the last header line has been sent to the server. This method sends the request to the server. The optional message_body @@ -1330,7 +1358,8 @@ def endheaders(self, message_body=None, *, encode_chunked=False): self.__state = _CS_REQ_SENT else: raise CannotSendHeader() - self._send_output(message_body, encode_chunked=encode_chunked) + self._send_output(message_body, encode_chunked=encode_chunked, + expect_continue=expect_continue) def request(self, method, url, body=None, headers={}, *, encode_chunked=False): @@ -1375,20 +1404,33 @@ def _send_request(self, method, url, body, headers, encode_chunked): else: encode_chunked = False + # If the Expect: 100-continue header is set, we will try to honor it + # (if possible). We can only do so if 1) the request has a body, and + # 2) there is no current incomplete response (since we need to read + # the response stream to check if the code is 100 or not) + expect_continue = ( + body and not self.__response + and 'expect' in header_names + and '100-continue' in {v.lower() for k, v in headers.items() + if k.lower() == 'expect'} + ) + for hdr, value in headers.items(): self.putheader(hdr, value) if isinstance(body, str): # RFC 2616 Section 3.7.1 says that text default has a # default charset of iso-8859-1. body = _encode(body, 'body') - self.endheaders(body, encode_chunked=encode_chunked) + self.endheaders(body, encode_chunked=encode_chunked, + expect_continue=expect_continue) - def getresponse(self): + def getresponse(self, ignore_100_continue=True): """Get the response from the server. If the HTTPConnection is in the correct state, returns an - instance of HTTPResponse or of whatever object is returned by - the response_class variable. + instance of HTTPResponse or of whatever object is returned by the + response_class variable. The connection will wait for a response other + than code 100 ('Continue') unless ignore_100_continue is set to False. If a request has not been sent or if a previous response has not be handled, ResponseNotReady is raised. If the HTTP @@ -1419,7 +1461,10 @@ def getresponse(self): if self.__state != _CS_REQ_SENT or self.__response: raise ResponseNotReady(self.__state) - if self.debuglevel > 0: + if self.__pending_response: + response = self.__pending_response + self.__pending_response = None + elif self.debuglevel > 0: response = self.response_class(self.sock, self.debuglevel, method=self._method) else: @@ -1427,19 +1472,21 @@ def getresponse(self): try: try: - response.begin() + response.begin(ignore_100_continue=ignore_100_continue) except ConnectionError: self.close() raise assert response.will_close != _UNKNOWN - self.__state = _CS_IDLE + if response.code != 100: + # Code 100 is effectively 'not a response' for this purpose + self.__state = _CS_IDLE - if response.will_close: - # this effectively passes the connection to the response - self.close() - else: - # remember this, so we can tell when it is complete - self.__response = response + if response.will_close: + # this effectively passes the connection to the response + self.close() + else: + # remember this, so we can tell when it is complete + self.__response = response return response except: diff --git a/Lib/test/test_httplib.py b/Lib/test/test_httplib.py index 38429ad480ff1c..51e573b2f98c1c 100644 --- a/Lib/test/test_httplib.py +++ b/Lib/test/test_httplib.py @@ -6,6 +6,7 @@ import os import array import re +import select import socket import threading @@ -125,6 +126,7 @@ def __init__(self, *args): super().__init__('example.com') self.fake_socket_args = args self._create_connection = self.create_connection + self._select = self.fake_select def connect(self): """Count the number of times connect() is invoked""" @@ -134,6 +136,21 @@ def connect(self): def create_connection(self, *pos, **kw): return FakeSocket(*self.fake_socket_args) + def fake_select(self, rlist, wlist, xlist, timeout=None): + """Real select.select() won't work on our fake socket""" + # Only needs to support read select for current tests + ret = [] + for r in rlist: + assert isinstance(r, FakeSocket) + if r.data and not hasattr(r, 'file'): + ret.append(r) # Has data to read, hasn't called makefile() yet + elif getattr(r, 'file', None) and not r.file_closed: + if r.file.read(1): + r.file.seek(r.file.tell() - 1) + ret.append(r) + return ret, [], [] + + class HeaderTests(TestCase): def test_auto_headers(self): # Some headers are added automatically, but should not be added by @@ -2216,6 +2233,126 @@ def test_binary_file_body(self): self.assertEqual(b'5\r\nbody\xc1\r\n0\r\n\r\n', f.read()) +class ExpectContinueTests(TestCase): + + def test_data_sent_on_continue(self): + # Prior versions would pass this too, as they send body immediately + # and ignore the Continue, but make sure it still behaves as expected. + conn = FakeSocketHTTPConnection( + b'HTTP/1.1 100 Continue\r\n' + b'\r\n' + b'HTTP/1.1 200 OK\r\n' + b'Content-Length: 0\r\n' + b'\r\n' + ) + conn.request('PUT', '/', headers={'Expect': '100-continue'}, + body=b'Body content') + resp = conn.getresponse() + self.assertEqual(resp.code, 200) + self.assertIn(b'Expect: 100-continue', conn.sock.data) + self.assertIn(b'Body content', conn.sock.data) + + def test_data_not_sent_on_error(self): + conn = FakeSocketHTTPConnection( + b'HTTP/1.1 429 Too Many Requests\r\n' + b'Content-Length: 0\r\n' + b'\r\n' + ) + conn.request('PUT', '/', headers={'Expect': '100-continue'}, + body=b'Body content') + resp = conn.getresponse() + self.assertEqual(resp.code, 429) + self.assertIn(b'Expect: 100-continue', conn.sock.data) + self.assertNotIn(b'Body content', conn.sock.data) + + def test_body_sent_on_timeout(self): + def client_thread(port): + conn = client.HTTPConnection('localhost', port, + continue_timeout=0.75) + conn.request('PUT', '/', + headers={'Expect': '100-continue'}, + body=b'Body content') + resp = conn.getresponse() + self.assertEqual(resp.code, 200) + resp.close() + conn.close() + + with socket.socket() as sock: + sock.bind(('localhost', 0)) + sock.listen(1) + t = threading.Thread(target=client_thread, + args=(sock.getsockname()[1],)) + t.start() + conn, _ = sock.accept() + req_data = conn.recv(4096) + self.assertTrue(req_data.startswith(b'PUT / HTTP/1.1\r\n')) + self.assertIn(b'Expect: 100-continue\r\n', req_data) + self.assertIn(b'Content-Length: 12\r\n', req_data) + self.assertNotIn(b'Body content', req_data) + # Client should not send body data yet + rr, _, _ = select.select([conn], [], [], 0.5) + self.assertEqual(rr, []) + # Client should time out and send body data in another ~0.25s + rr, _, _ = select.select([conn], [], [], 0.5) + self.assertEqual(rr, [conn]) + body = conn.recv(4096) + self.assertEqual(body, b'Body content') + conn.sendall( + b'HTTP/1.1 200 OK\r\n' + b'Content-Length: 0\r\n' + b'\r\n' + ) + conn.close() + sock.close() + t.join() + + def test_manual_getresponse(self): + def client_thread(port): + conn = client.HTTPConnection('localhost', port, + continue_timeout=0) + conn.putrequest("PUT", "/file") + conn.putheader('Expect', '100-Continue') + conn.putheader('Content-Length', '42') + conn.endheaders() + with conn.getresponse(ignore_100_continue=False) as resp: + self.assertEqual(resp.status, 100) + conn.send(b'Go away or I shall taunt you a second time') + with conn.getresponse() as resp: + self.assertEqual(resp.status, 200) + conn.close() + + with socket.socket() as sock: + sock.bind(('localhost', 0)) + sock.listen(1) + t = threading.Thread(target=client_thread, + args=(sock.getsockname()[1],)) + t.start() + conn, _ = sock.accept() + req_data = conn.recv(4096) + self.assertTrue(req_data.startswith(b'PUT /file HTTP/1.1\r\n')) + self.assertIn(b'Expect: 100-Continue\r\n', req_data) + self.assertIn(b'Content-Length: 42\r\n', req_data) + self.assertNotIn(b'I shall taunt you', req_data) + # Client is not expected to send body data until we respond, + # regardless of continue_timeout + rr, _, _ = select.select([conn], [], [], 1.0) + self.assertEqual(rr, []) + conn.sendall(b'HTTP/1.1 100 Continue\r\n\r\n') + rr, _, _ = select.select([conn], [], [], 0.5) + self.assertEqual(rr, [conn]) + b = conn.recv(42) + self.assertEqual(b, b'Go away or I shall taunt you a second time') + conn.sendall( + b'HTTP/1.1 200 OK\r\n' + b'Content-Length: 0\r\n' + b'\r\n' + ) + conn.close() + sock.close() + t.join() + + + class HTTPResponseTest(TestCase): def setUp(self): diff --git a/Misc/ACKS b/Misc/ACKS index 42068ec6aefbd2..c5f9d632424a1c 100644 --- a/Misc/ACKS +++ b/Misc/ACKS @@ -1210,6 +1210,7 @@ Daniel May Madison May Lucas Maystre Arnaud Mazin +Ashwin Menon Pam McA'Nulty Matt McClure Jack McCracken diff --git a/Misc/NEWS.d/next/Library/2019-07-22-00-57-26.bpo-1346874.TentEE.rst b/Misc/NEWS.d/next/Library/2019-07-22-00-57-26.bpo-1346874.TentEE.rst new file mode 100644 index 00000000000000..9ede4743c19018 --- /dev/null +++ b/Misc/NEWS.d/next/Library/2019-07-22-00-57-26.bpo-1346874.TentEE.rst @@ -0,0 +1,2 @@ +``http.client.HTTPConnection`` now waits for a ``100 Continue`` response +before sending body content if the ``Expect: 100-Continue`` header is set. From 532026d696881befb44a4ed887a15e27a3cd50c2 Mon Sep 17 00:00:00 2001 From: IngridMorstrad Date: Wed, 21 May 2025 15:23:46 -0700 Subject: [PATCH 2/2] Update version this will go into Co-authored-by: Peter Bierma --- Doc/library/http.client.rst | 10 +++++----- .../Library/2019-07-22-00-57-26.bpo-1346874.TentEE.rst | 2 +- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/Doc/library/http.client.rst b/Doc/library/http.client.rst index af84633bbb0aff..04c1a5a4232b71 100644 --- a/Doc/library/http.client.rst +++ b/Doc/library/http.client.rst @@ -69,7 +69,7 @@ The module provides the following classes: .. versionchanged:: 3.7 *blocksize* parameter was added. - .. versionchanged:: 3.14 + .. versionchanged:: next *continue_timeout* parameter was added. @@ -327,7 +327,7 @@ HTTPConnection Objects No attempt is made to determine the Content-Length for file objects. - .. versionchanged:: 3.14 + .. versionchanged:: next If the headers include ``Expect: 100-Continue`` and *body* is set, the body will not be sent until either a ``100 Continue`` response is received from the server or the :class:`HTTPConnection`'s *continue_timeout* @@ -355,7 +355,7 @@ HTTPConnection Objects :class:`HTTPConnection` object will be ready to reconnect when a new request is sent. - .. versionchanged:: 3.14 + .. versionchanged:: next Added the *ignore_100_continue* parameter. (In prior versions a ``Continue`` response was always ignored.) @@ -493,7 +493,7 @@ also send your request step by step, by using the four functions below. .. versionchanged:: 3.6 Added chunked encoding support and the *encode_chunked* parameter. - .. versionadded:: 3.14 + .. versionadded:: next The *expect_continue* parameter was added. @@ -674,7 +674,7 @@ method attribute. Here is an example session that uses the ``PUT`` method:: >>> print(response.status, response.reason) 200, OK -Since version 3.14, conditional transmission of the body is supported when an +Since version 3.15, conditional transmission of the body is supported when an ``Expect: 100-Continue`` header is set. To use this in a simple case, just set the header, and optionally the time for which the client should wait for a ``100 Continue`` response before sending the body regardless:: diff --git a/Misc/NEWS.d/next/Library/2019-07-22-00-57-26.bpo-1346874.TentEE.rst b/Misc/NEWS.d/next/Library/2019-07-22-00-57-26.bpo-1346874.TentEE.rst index 9ede4743c19018..fbb7b58bc2ec90 100644 --- a/Misc/NEWS.d/next/Library/2019-07-22-00-57-26.bpo-1346874.TentEE.rst +++ b/Misc/NEWS.d/next/Library/2019-07-22-00-57-26.bpo-1346874.TentEE.rst @@ -1,2 +1,2 @@ -``http.client.HTTPConnection`` now waits for a ``100 Continue`` response +:class:`http.client.HTTPConnection` now waits for a ``100 Continue`` response before sending body content if the ``Expect: 100-Continue`` header is set.