Commit e53d977

Explain the use of charset parameter with Content-Type header: issue11082
2 parents df2aecb + 6b3434a commit e53d977

3 files changed

Lines changed: 58 additions & 28 deletions

Doc/library/urllib.parse.rst

Lines changed: 4 additions & 3 deletions
@@ -512,9 +512,10 @@ task isn't already covered by the URL parsing functions above.
 
 Convert a mapping object or a sequence of two-element tuples, which may
 either be a :class:`str` or a :class:`bytes`, to a "percent-encoded"
-string. The resultant string must be converted to bytes using the
-user-specified encoding before it is sent to :func:`urlopen` as the optional
-*data* argument.
+string. If the resultant string is to be used as a *data* for POST
+operation with :func:`urlopen` function, then it should be properly encoded
+to bytes, otherwise it would result in a :exc:`TypeError`.
+
 The resulting string is a series of ``key=value`` pairs separated by ``'&'``
 characters, where both *key* and *value* are quoted using :func:`quote_plus`
 above. When a sequence of two-element tuples is used as the *query*
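
Note (not part of the commit): the revised paragraph can be illustrated with a minimal sketch. It builds a query string with `urllib.parse.urlencode` and encodes it to bytes, which is the form `urlopen` expects for *data*. The field names and the choice of UTF-8 are assumptions made for this example.

```python
from urllib.parse import urlencode

# Hypothetical form fields; urlencode returns a str, not bytes.
query = urlencode({"spam": 1, "eggs": 2, "name": "café"})

# Encode to bytes before using the result as POST data.
body = query.encode("utf-8")

print(query)
print(type(body).__name__)
```

Because the percent-encoded result contains only ASCII characters, the `encode` step here mainly changes the type from `str` to `bytes`.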

Doc/library/urllib.request.rst

Lines changed: 51 additions & 23 deletions
@@ -2,9 +2,10 @@
 =============================================================
 
 .. module:: urllib.request
-:synopsis: Next generation URL opening library.
+:synopsis: Extensible library for opening URLs.
 .. moduleauthor:: Jeremy Hylton <[email protected]>
 .. sectionauthor:: Moshe Zadka <[email protected]>
+.. sectionauthor:: Senthil Kumaran <[email protected]>
 
 
 The :mod:`urllib.request` module defines functions and classes which help in
@@ -20,16 +21,26 @@ The :mod:`urllib.request` module defines the following functions:
 Open the URL *url*, which can be either a string or a
 :class:`Request` object.
 
-*data* may be a bytes object specifying additional data to send to the
+*data* must be a bytes object specifying additional data to be sent to the
 server, or ``None`` if no such data is needed. *data* may also be an
 iterable object and in that case Content-Length value must be specified in
 the headers. Currently HTTP requests are the only ones that use *data*; the
 HTTP request will be a POST instead of a GET when the *data* parameter is
-provided. *data* should be a buffer in the standard
+provided.
+
+*data* should be a buffer in the standard
 :mimetype:`application/x-www-form-urlencoded` format. The
 :func:`urllib.parse.urlencode` function takes a mapping or sequence of
-2-tuples and returns a string in this format. urllib.request module uses
-HTTP/1.1 and includes ``Connection:close`` header in its HTTP requests.
+2-tuples and returns a string in this format. It should be encoded to bytes
+before being used as the *data* parameter. The charset parameter in
+``Content-Type`` header may be used to specify the encoding. If charset
+parameter is not sent with the Content-Type header, the server following the
+HTTP 1.1 recommendation may assume that the data is encoded in ISO-8859-1
+encoding. It is advisable to use charset parameter with encoding used in
+``Content-Type`` header with the :class:`Request`.
+
+urllib.request module uses HTTP/1.1 and includes ``Connection:close`` header
+in its HTTP requests.
 
 The optional *timeout* parameter specifies a timeout in seconds for
 blocking operations like the connection attempt (if not specified,
@@ -66,9 +77,10 @@ The :mod:`urllib.request` module defines the following functions:
 are handled through the proxy when they are set.
 
 The legacy ``urllib.urlopen`` function from Python 2.6 and earlier has been
-discontinued; :func:`urlopen` corresponds to the old ``urllib2.urlopen``.
-Proxy handling, which was done by passing a dictionary parameter to
-``urllib.urlopen``, can be obtained by using :class:`ProxyHandler` objects.
+discontinued; :func:`urllib.request.urlopen` corresponds to the old
+``urllib2.urlopen``. Proxy handling, which was done by passing a dictionary
+parameter to ``urllib.urlopen``, can be obtained by using
+:class:`ProxyHandler` objects.
 
 .. versionchanged:: 3.2
 *cafile* and *capath* were added.
@@ -83,10 +95,11 @@ The :mod:`urllib.request` module defines the following functions:
 .. function:: install_opener(opener)
 
 Install an :class:`OpenerDirector` instance as the default global opener.
-Installing an opener is only necessary if you want urlopen to use that opener;
-otherwise, simply call :meth:`OpenerDirector.open` instead of :func:`urlopen`.
-The code does not check for a real :class:`OpenerDirector`, and any class with
-the appropriate interface will work.
+Installing an opener is only necessary if you want urlopen to use that
+opener; otherwise, simply call :meth:`OpenerDirector.open` instead of
+:func:`~urllib.request.urlopen`. The code does not check for a real
+:class:`OpenerDirector`, and any class with the appropriate interface will
+work.
 
 
 .. function:: build_opener([handler, ...])
@@ -138,13 +151,21 @@ The following classes are provided:
 
 *url* should be a string containing a valid URL.
 
-*data* may be a bytes object specifying additional data to send to the
+*data* must be a bytes object specifying additional data to send to the
 server, or ``None`` if no such data is needed. Currently HTTP requests are
 the only ones that use *data*; the HTTP request will be a POST instead of a
 GET when the *data* parameter is provided. *data* should be a buffer in the
-standard :mimetype:`application/x-www-form-urlencoded` format. The
-:func:`urllib.parse.urlencode` function takes a mapping or sequence of
-2-tuples and returns a string in this format.
+standard :mimetype:`application/x-www-form-urlencoded` format.
+
+The :func:`urllib.parse.urlencode` function takes a mapping or sequence of
+2-tuples and returns a string in this format. It should be encoded to bytes
+before being used as the *data* parameter. The charset parameter in
+``Content-Type`` header may be used to specify the encoding. If charset
+parameter is not sent with the Content-Type header, the server following the
+HTTP 1.1 recommendation may assume that the data is encoded in ISO-8859-1
+encoding. It is advisable to use charset parameter with encoding used in
+``Content-Type`` header with the :class:`Request`.
+
 
 *headers* should be a dictionary, and will be treated as if
 :meth:`add_header` was called with each key and value as arguments.
@@ -156,8 +177,11 @@ The following classes are provided:
 :mod:`urllib`'s default user agent string is
 ``"Python-urllib/2.6"`` (on Python 2.6).
 
-The following two arguments, *origin_req_host* and *unverifiable*,
-are only of interest for correct handling of third-party HTTP cookies:
+An example of using ``Content-Type`` header with *data* argument would be
+sending a dictionary like ``{"Content-Type":" application/x-www-form-urlencoded;charset=utf-8"}``
+
+The final two arguments are only of interest for correct handling
+of third-party HTTP cookies:
 
 *origin_req_host* should be the request-host of the origin
 transaction, as defined by :rfc:`2965`. It defaults to
@@ -1107,8 +1131,9 @@ every :class:`Request`. To change this::
 opener.open('http://www.example.com/')
 
 Also, remember that a few standard headers (:mailheader:`Content-Length`,
-:mailheader:`Content-Type` and :mailheader:`Host`) are added when the
-:class:`Request` is passed to :func:`urlopen` (or :meth:`OpenerDirector.open`).
+:mailheader:`Content-Type` without charset parameter and :mailheader:`Host`)
+are added when the :class:`Request` is passed to :func:`urlopen` (or
+:meth:`OpenerDirector.open`).
 
 .. _urllib-examples:
 
@@ -1126,9 +1151,12 @@ from urlencode is encoded to bytes before it is sent to urlopen as data::
 
 >>> import urllib.request
 >>> import urllib.parse
->>> params = urllib.parse.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
->>> params = params.encode('utf-8')
->>> f = urllib.request.urlopen("http://www.musi-cal.com/cgi-bin/query", params)
+>>> data = urllib.parse.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
+>>> data = data.encode('utf-8')
+>>> request = urllib.request.Request("http://requestb.in/xrbl82xr")
+>>> # adding charset parameter to the Content-Type header.
+>>> request.add_header("Content-Type","application/x-www-form-urlencoded;charset=utf-8")
+>>> f = urllib.request.urlopen(request, data)
 >>> print(f.read().decode('utf-8'))
 
 The following example uses an explicitly specified HTTP proxy, overriding
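
Note (not part of the commit): the documentation changes above describe passing a `Content-Type` header with a charset parameter alongside bytes *data*. A minimal sketch of that pattern follows, assuming a placeholder URL (`http://www.example.com/`) and UTF-8 as the encoding. It only constructs the `Request` and inspects it; nothing is sent over the network.

```python
import urllib.parse
import urllib.request

# urlencode returns a str; encode it to bytes for use as POST data.
data = urllib.parse.urlencode({"spam": 1, "eggs": 2, "bacon": 0}).encode("utf-8")

# www.example.com is a placeholder; nothing is sent until the request is opened.
request = urllib.request.Request(
    "http://www.example.com/",
    data=data,
    headers={"Content-Type": "application/x-www-form-urlencoded;charset=utf-8"},
)

# With data attached, the request method defaults to POST.
print(request.get_method())
# Request stores header names capitalized as "Content-type".
print(request.get_header("Content-type"))
```

Passing the header through the *headers* dictionary is equivalent to calling `add_header` afterwards, as the doctest in the hunk above does.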

Lib/urllib/request.py

Lines changed: 3 additions & 2 deletions
@@ -1172,8 +1172,9 @@ def do_request_(self, request):
         if request.data is not None: # POST
             data = request.data
             if isinstance(data, str):
-                raise TypeError("POST data should be bytes"
-                        " or an iterable of bytes. It cannot be str.")
+                msg = "POST data should be bytes or an iterable of bytes."\
+                      "It cannot be str"
+                raise TypeError(msg)
             if not request.has_header('Content-type'):
                 request.add_unredirected_header(
                     'Content-type',
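
Note (not part of the commit): the check in this hunk can be exercised without a network connection by calling the handler's `do_request_` method directly on a `Request` whose *data* is a `str`. This is a sketch for illustration only: the URL is a placeholder, `urlopen` would normally invoke this method itself, and the exact wording of the error message may differ between Python versions.

```python
import urllib.request

# Placeholder URL; do_request_ is called directly, so no connection is made.
request = urllib.request.Request("http://www.example.com/", data="spam=1")

handler = urllib.request.HTTPHandler()
try:
    # A str body is rejected before any headers are added.
    handler.do_request_(request)
except TypeError as exc:
    error = exc
else:
    error = None

print(type(error).__name__)
```

As written in the hunk, the two adjacent string literals are joined without a space, so the message in this commit would read `...an iterable of bytes.It cannot be str`.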
