Commit ef91bb2

Issue #12319: Always send file request bodies using chunked encoding
The previous attempt to determine the file’s Content-Length gave a false positive for pipes on Windows. Also, drop the special case for sending zero-length iterable bodies.
1 parent 8f96a30 commit ef91bb2
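
The effect described in the commit message can be observed without a network connection. The following is a minimal sketch, assuming a Python version that includes this change; `RecordingSocket` and `capture_request` are hypothetical helpers written for this illustration (modelled loosely on the `FakeSocket` used in the test suite), not part of `http.client`.

```python
import http.client
import io


class RecordingSocket:
    """Hypothetical socket stand-in that records bytes instead of sending them."""

    def __init__(self):
        self.data = b""

    def sendall(self, data):
        self.data += bytes(data)

    def close(self):
        pass


def capture_request(body, headers=None):
    """Return the (header block, payload) that http.client would transmit."""
    conn = http.client.HTTPConnection("example.com")
    conn.sock = RecordingSocket()  # socket is pre-set, so nothing is connected
    conn.request("POST", "/", body, headers or {})
    head, _, payload = conn.sock.data.partition(b"\r\n\r\n")
    return head, payload


# A file object is chunk-encoded, even though its size could be found by seeking.
file_head, file_payload = capture_request(io.BytesIO(b"body"))
# A bytes object still gets a Content-Length header.
bytes_head, bytes_payload = capture_request(b"body")
# A zero-length iterable is treated like any other iterable.
empty_head, empty_payload = capture_request(())

print(file_head.decode("ascii"))
print(file_payload)
```

Passing an explicit `Content-Length` in `headers` switches the file case back to an unchunked body.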

7 files changed

Lines changed: 96 additions & 82 deletions


Doc/library/http.client.rst

Lines changed: 15 additions & 16 deletions
@@ -240,17 +240,17 @@ HTTPConnection Objects
    The *headers* argument should be a mapping of extra HTTP headers to send
    with the request.
 
-   If *headers* contains neither Content-Length nor Transfer-Encoding, a
-   Content-Length header will be added automatically if possible. If
+   If *headers* contains neither Content-Length nor Transfer-Encoding,
+   but there is a request body, one of those
+   header fields will be added automatically. If
    *body* is ``None``, the Content-Length header is set to ``0`` for
    methods that expect a body (``PUT``, ``POST``, and ``PATCH``). If
-   *body* is a string or bytes-like object, the Content-Length header is
-   set to its length. If *body* is a binary :term:`file object`
-   supporting :meth:`~io.IOBase.seek`, this will be used to determine
-   its size. Otherwise, the Content-Length header is not added
-   automatically. In cases where determining the Content-Length up
-   front is not possible, the body will be chunk-encoded and the
-   Transfer-Encoding header will automatically be set.
+   *body* is a string or a bytes-like object that is not also a
+   :term:`file <file object>`, the Content-Length header is
+   set to its length. Any other type of *body* (files
+   and iterables in general) will be chunk-encoded, and the
+   Transfer-Encoding header will automatically be set instead of
+   Content-Length.
 
    The *encode_chunked* argument is only relevant if Transfer-Encoding is
    specified in *headers*. If *encode_chunked* is ``False``, the
@@ -260,19 +260,18 @@ HTTPConnection Objects
    .. note::
       Chunked transfer encoding has been added to the HTTP protocol
       version 1.1. Unless the HTTP server is known to handle HTTP 1.1,
-      the caller must either specify the Content-Length or must use a
-      body representation whose length can be determined automatically.
+      the caller must either specify the Content-Length, or must pass a
+      :class:`str` or bytes-like object that is not also a file as the
+      body representation.
 
    .. versionadded:: 3.2
       *body* can now be an iterable.
 
    .. versionchanged:: 3.6
       If neither Content-Length nor Transfer-Encoding are set in
-      *headers* and Content-Length cannot be determined, *body* will now
-      be automatically chunk-encoded. The *encode_chunked* argument
-      was added.
-      The Content-Length for binary file objects is determined with seek.
-      No attempt is made to determine the Content-Length for text file
+      *headers*, file and iterable *body* objects are now chunk-encoded.
+      The *encode_chunked* argument was added.
+      No attempt is made to determine the Content-Length for file
       objects.
 
 .. method:: HTTPConnection.getresponse()

Doc/library/urllib.request.rst

Lines changed: 7 additions & 8 deletions
@@ -187,12 +187,11 @@ The following classes are provided:
    server, or ``None`` if no such data is needed. Currently HTTP
    requests are the only ones that use *data*. The supported object
    types include bytes, file-like objects, and iterables. If no
-   ``Content-Length`` header has been provided, :class:`HTTPHandler` will
-   try to determine the length of *data* and set this header accordingly.
-   If this fails, ``Transfer-Encoding: chunked`` as specified in
-   :rfc:`7230`, Section 3.3.1 will be used to send the data. See
-   :meth:`http.client.HTTPConnection.request` for details on the
-   supported object types and on how the content length is determined.
+   ``Content-Length`` nor ``Transfer-Encoding`` header field
+   has been provided, :class:`HTTPHandler` will set these headers according
+   to the type of *data*. ``Content-Length`` will be used to send
+   bytes objects, while ``Transfer-Encoding: chunked`` as specified in
+   :rfc:`7230`, Section 3.3.1 will be used to send files and other iterables.
 
    For an HTTP POST request method, *data* should be a buffer in the
    standard :mimetype:`application/x-www-form-urlencoded` format. The
@@ -256,8 +255,8 @@ The following classes are provided:
 
    .. versionchanged:: 3.6
       Do not raise an error if the ``Content-Length`` has not been
-      provided and could not be determined. Fall back to use chunked
-      transfer encoding instead.
+      provided and *data* is neither ``None`` nor a bytes object.
+      Fall back to use chunked transfer encoding instead.
 
 .. class:: OpenerDirector()
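
The header selection documented above for `urllib.request` can be checked by running only the handler's request-preparation step. This is a sketch under stated assumptions: `prepare` is a hypothetical helper, and it relies on `HTTPHandler.http_request` preparing the request without opening a connection, which is how the tests in this commit use `do_request_`.

```python
import io
import urllib.request


def prepare(data, headers=None):
    """Run only the header-preparation step of HTTPHandler; nothing is sent."""
    handler = urllib.request.HTTPHandler()
    # The handler reads default headers from its parent opener.
    handler.add_parent(urllib.request.OpenerDirector())
    req = urllib.request.Request("http://example.com/", data, headers or {})
    return handler.http_request(req)


file_req = prepare(io.BytesIO(b"body"))                             # file-like object
bytes_req = prepare(b"body")                                        # bytes object
explicit_req = prepare(io.BytesIO(b"body"), {"Content-Length": "4"})  # caller-supplied length

for label, req in [("file", file_req), ("bytes", bytes_req), ("explicit", explicit_req)]:
    print(label,
          "Content-length:", req.get_header("Content-length"),
          "Transfer-encoding:", req.get_header("Transfer-encoding"))
```

Note that `Request` stores header names capitalized (`Content-length`, `Transfer-encoding`), which is why the lookups use that spelling.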

Doc/whatsnew/3.6.rst

Lines changed: 9 additions & 2 deletions
@@ -579,8 +579,8 @@ The :class:`~unittest.mock.Mock` class has the following improvements:
 urllib.request
 --------------
 
-If a HTTP request has a non-empty body but no Content-Length header
-and the content length cannot be determined up front, rather than
+If a HTTP request has a file or iterable body (other than a
+bytes object) but no Content-Length header, rather than
 throwing an error, :class:`~urllib.request.AbstractHTTPHandler` now
 falls back to use chunked transfer encoding.
 (Contributed by Demian Brecht and Rolf Krahl in :issue:`12319`.)
@@ -935,6 +935,13 @@ Changes in the Python API
   This behavior has also been backported to earlier Python versions
   by Setuptools 26.0.0.
 
+* In the :mod:`urllib.request` module and the
+  :meth:`http.client.HTTPConnection.request` method, if no Content-Length
+  header field has been specified and the request body is a file object,
+  it is now sent with HTTP 1.1 chunked encoding. If a file object has to
+  be sent to a HTTP 1.0 server, the Content-Length value now has to be
+  specified by the caller. See :issue:`12319`.
+
 Changes in the C API
 --------------------
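
For callers affected by the porting note above, one way to keep sending a regular file without chunked encoding is to compute the length and pass it explicitly. The sketch below is an illustration, not the library's method: `remaining_size` and `RecordingSocket` are hypothetical helpers, and `os.fstat` is only meaningful for regular files (the commit message notes that the size of a pipe cannot be determined reliably).

```python
import http.client
import os
import tempfile


class RecordingSocket:
    """Hypothetical socket stand-in that records bytes instead of sending them."""

    def __init__(self):
        self.data = b""

    def sendall(self, data):
        self.data += bytes(data)

    def close(self):
        pass


def remaining_size(fileobj):
    """Bytes between the current position and the end of a regular file."""
    return os.fstat(fileobj.fileno()).st_size - fileobj.tell()


with tempfile.TemporaryFile() as f:
    f.write(b"body")
    f.flush()
    f.seek(0)

    conn = http.client.HTTPConnection("example.com")
    conn.sock = RecordingSocket()  # socket is pre-set, so nothing is connected
    # An explicit Content-Length suppresses the automatic chunked encoding.
    conn.request("PUT", "/upload", f,
                 {"Content-Length": str(remaining_size(f))})
    head, _, payload = conn.sock.data.partition(b"\r\n\r\n")

print(head.decode("ascii"))
print(payload)
```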

Lib/http/client.py

Lines changed: 8 additions & 23 deletions
@@ -805,35 +805,21 @@ def _is_textIO(stream):
     def _get_content_length(body, method):
         """Get the content-length based on the body.
 
-        If the body is "empty", we set Content-Length: 0 for methods
-        that expect a body (RFC 7230, Section 3.3.2). If the body is
-        set for other methods, we set the header provided we can
-        figure out what the length is.
+        If the body is None, we set Content-Length: 0 for methods that expect
+        a body (RFC 7230, Section 3.3.2). We also set the Content-Length for
+        any method if the body is a str or bytes-like object and not a file.
         """
-        if not body:
+        if body is None:
             # do an explicit check for not None here to distinguish
             # between unset and set but empty
-            if method.upper() in _METHODS_EXPECTING_BODY or body is not None:
+            if method.upper() in _METHODS_EXPECTING_BODY:
                 return 0
             else:
                 return None
 
         if hasattr(body, 'read'):
             # file-like object.
-            if HTTPConnection._is_textIO(body):
-                # text streams are unpredictable because it depends on
-                # character encoding and line ending translation.
-                return None
-            else:
-                # Is it seekable?
-                try:
-                    curpos = body.tell()
-                    sz = body.seek(0, io.SEEK_END)
-                except (TypeError, AttributeError, OSError):
-                    return None
-                else:
-                    body.seek(curpos)
-                    return sz - curpos
+            return None
 
         try:
             # does it implement the buffer protocol (bytes, bytearray, array)?
@@ -1266,8 +1252,7 @@ def _send_request(self, method, url, body, headers, encode_chunked):
         # the caller passes encode_chunked=True or the following
         # conditions hold:
         # 1. content-length has not been explicitly set
-        # 2. the length of the body cannot be determined
-        #    (e.g. it is a generator or unseekable file)
+        # 2. the body is a file or iterable, but not a str or bytes-like
         # 3. Transfer-Encoding has NOT been explicitly set by the caller
 
         if 'content-length' not in header_names:
@@ -1280,7 +1265,7 @@ def _send_request(self, method, url, body, headers, encode_chunked):
                 encode_chunked = False
                 content_length = self._get_content_length(body, method)
                 if content_length is None:
-                    if body:
+                    if body is not None:
                         if self.debuglevel > 0:
                             print('Unable to determine size of %r' % body)
                         encode_chunked = True
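
The decision rule that results from the two `Lib/http/client.py` hunks can be restated as a standalone function. This is a sketch for illustration only: `get_content_length` and `METHODS_EXPECTING_BODY` are names chosen here, and the part after the `try:` line (which the diff does not show) is an assumption about how the buffer-protocol and `str` cases are handled.

```python
import io

# Assumption: mirrors the methods named in the documentation (PUT, POST, PATCH).
METHODS_EXPECTING_BODY = {"PATCH", "POST", "PUT"}


def get_content_length(body, method):
    """Return the Content-Length to send, or None to use chunked encoding."""
    if body is None:
        # No body at all: send Content-Length: 0 only where a body is expected.
        return 0 if method.upper() in METHODS_EXPECTING_BODY else None

    if hasattr(body, "read"):
        # File-like object: never measured, always chunk-encoded.
        return None

    try:
        # Buffer protocol (bytes, bytearray, array.array, ...).
        return memoryview(body).nbytes
    except TypeError:
        pass

    if isinstance(body, str):
        return len(body)

    # Any other iterable, including an empty one.
    return None


for body in (None, b"", b"body", "body", (), io.BytesIO(b"body")):
    print(repr(body), "->", get_content_length(body, "POST"))
```

A return value of `None` combined with a body that is not `None` is the condition under which the caller sets `encode_chunked = True`.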

Lib/test/test_httplib.py

Lines changed: 19 additions & 8 deletions
@@ -381,6 +381,16 @@ def test_request(self):
             # same request
             self.assertNotIn('content-length', [k.lower() for k in headers])
 
+    def test_empty_body(self):
+        # Zero-length iterable should be treated like any other iterable
+        conn = client.HTTPConnection('example.com')
+        conn.sock = FakeSocket(b'')
+        conn.request('POST', '/', ())
+        _, headers, body = self._parse_request(conn.sock.data)
+        self.assertEqual(headers['Transfer-Encoding'], 'chunked')
+        self.assertNotIn('content-length', [k.lower() for k in headers])
+        self.assertEqual(body, b"0\r\n\r\n")
+
     def _make_body(self, empty_lines=False):
         lines = self.expected_body.split(b' ')
         for idx, line in enumerate(lines):
@@ -652,7 +662,9 @@ def test_too_many_headers(self):
 
     def test_send_file(self):
         expected = (b'GET /foo HTTP/1.1\r\nHost: example.com\r\n'
-                    b'Accept-Encoding: identity\r\nContent-Length:')
+                    b'Accept-Encoding: identity\r\n'
+                    b'Transfer-Encoding: chunked\r\n'
+                    b'\r\n')
 
         with open(__file__, 'rb') as body:
             conn = client.HTTPConnection('example.com')
@@ -1717,7 +1729,7 @@ def test_bytes_body(self):
         self.assertEqual("5", message.get("content-length"))
         self.assertEqual(b'body\xc1', f.read())
 
-    def test_file_body(self):
+    def test_text_file_body(self):
         self.addCleanup(support.unlink, support.TESTFN)
         with open(support.TESTFN, "w") as f:
             f.write("body")
@@ -1726,10 +1738,8 @@ def test_file_body(self):
             message, f = self.get_headers_and_fp()
         self.assertEqual("text/plain", message.get_content_type())
         self.assertIsNone(message.get_charset())
-        # Note that the length of text files is unpredictable
-        # because it depends on character encoding and line ending
-        # translation. No content-length will be set, the body
-        # will be sent using chunked transfer encoding.
+        # No content-length will be determined for files; the body
+        # will be sent using chunked transfer encoding instead.
         self.assertIsNone(message.get("content-length"))
         self.assertEqual("chunked", message.get("transfer-encoding"))
         self.assertEqual(b'4\r\nbody\r\n0\r\n\r\n', f.read())
@@ -1743,8 +1753,9 @@ def test_binary_file_body(self):
             message, f = self.get_headers_and_fp()
         self.assertEqual("text/plain", message.get_content_type())
         self.assertIsNone(message.get_charset())
-        self.assertEqual("5", message.get("content-length"))
-        self.assertEqual(b'body\xc1', f.read())
+        self.assertEqual("chunked", message.get("Transfer-Encoding"))
+        self.assertNotIn("Content-Length", message)
+        self.assertEqual(b'5\r\nbody\xc1\r\n0\r\n\r\n', f.read())
 
 
 class HTTPResponseTest(TestCase):

Lib/test/test_urllib2.py

Lines changed: 35 additions & 21 deletions
@@ -913,40 +913,50 @@ def test_http(self):
             self.assertEqual(req.unredirected_hdrs["Spam"], "foo")
 
     def test_http_body_file(self):
-        # A regular file - Content Length is calculated unless already set.
+        # A regular file - chunked encoding is used unless Content Length is
+        # already set.
 
         h = urllib.request.AbstractHTTPHandler()
         o = h.parent = MockOpener()
 
         file_obj = tempfile.NamedTemporaryFile(mode='w+b', delete=False)
         file_path = file_obj.name
-        file_obj.write(b"Something\nSomething\nSomething\n")
         file_obj.close()
+        self.addCleanup(os.unlink, file_path)
 
-        for headers in {}, {"Content-Length": 30}:
-            with open(file_path, "rb") as f:
-                req = Request("http://example.com/", f, headers)
-                newreq = h.do_request_(req)
-                self.assertEqual(int(newreq.get_header('Content-length')), 30)
+        with open(file_path, "rb") as f:
+            req = Request("http://example.com/", f, {})
+            newreq = h.do_request_(req)
+            te = newreq.get_header('Transfer-encoding')
+            self.assertEqual(te, "chunked")
+            self.assertFalse(newreq.has_header('Content-length'))
 
-        os.unlink(file_path)
+        with open(file_path, "rb") as f:
+            req = Request("http://example.com/", f, {"Content-Length": 30})
+            newreq = h.do_request_(req)
+            self.assertEqual(int(newreq.get_header('Content-length')), 30)
+            self.assertFalse(newreq.has_header("Transfer-encoding"))
 
     def test_http_body_fileobj(self):
-        # A file object - Content Length is calculated unless already set.
+        # A file object - chunked encoding is used
+        # unless Content Length is already set.
         # (Note that there are some subtle differences to a regular
         # file, that is why we are testing both cases.)
 
         h = urllib.request.AbstractHTTPHandler()
         o = h.parent = MockOpener()
-
         file_obj = io.BytesIO()
-        file_obj.write(b"Something\nSomething\nSomething\n")
 
-        for headers in {}, {"Content-Length": 30}:
-            file_obj.seek(0)
-            req = Request("http://example.com/", file_obj, headers)
-            newreq = h.do_request_(req)
-            self.assertEqual(int(newreq.get_header('Content-length')), 30)
+        req = Request("http://example.com/", file_obj, {})
+        newreq = h.do_request_(req)
+        self.assertEqual(newreq.get_header('Transfer-encoding'), 'chunked')
+        self.assertFalse(newreq.has_header('Content-length'))
+
+        headers = {"Content-Length": 30}
+        req = Request("http://example.com/", file_obj, headers)
+        newreq = h.do_request_(req)
+        self.assertEqual(int(newreq.get_header('Content-length')), 30)
+        self.assertFalse(newreq.has_header("Transfer-encoding"))
 
         file_obj.close()

@@ -959,9 +969,7 @@ def test_http_body_pipe(self):
         h = urllib.request.AbstractHTTPHandler()
         o = h.parent = MockOpener()
 
-        cmd = [sys.executable, "-c",
-               r"import sys; "
-               r"sys.stdout.buffer.write(b'Something\nSomething\nSomething\n')"]
+        cmd = [sys.executable, "-c", r"pass"]
         for headers in {}, {"Content-Length": 30}:
             with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
                 req = Request("http://example.com/", proc.stdout, headers)
@@ -983,8 +991,6 @@ def test_http_body_iterable(self):
 
         def iterable_body():
             yield b"one"
-            yield b"two"
-            yield b"three"
 
         for headers in {}, {"Content-Length": 11}:
             req = Request("http://example.com/", iterable_body(), headers)
@@ -996,6 +1002,14 @@ def iterable_body():
             else:
                 self.assertEqual(int(newreq.get_header('Content-length')), 11)
 
+    def test_http_body_empty_seq(self):
+        # Zero-length iterable body should be treated like any other iterable
+        h = urllib.request.AbstractHTTPHandler()
+        h.parent = MockOpener()
+        req = h.do_request_(Request("http://example.com/", ()))
+        self.assertEqual(req.get_header("Transfer-encoding"), "chunked")
+        self.assertFalse(req.has_header("Content-length"))
+
     def test_http_body_array(self):
         # array.array Iterable - Content Length is calculated

Misc/NEWS

Lines changed: 3 additions & 4 deletions
@@ -52,10 +52,9 @@ Library
 - Issue #12319: Chunked transfer encoding support added to
   http.client.HTTPConnection requests. The
   urllib.request.AbstractHTTPHandler class does not enforce a Content-Length
-  header any more. If a HTTP request has a non-empty body, but no
-  Content-Length header, and the content length cannot be determined
-  up front, rather than throwing an error, the library now falls back
-  to use chunked transfer encoding.
+  header any more. If a HTTP request has a file or iterable body, but no
+  Content-Length header, the library now falls back to use chunked transfer-
+  encoding.
 
 - A new version of typing.py from https://github.com/python/typing:
   - Collection (only for 3.6) (Issue #27598)
