22=============================================================
33
44.. module:: urllib.request
5- :synopsis: Next generation URL opening library.
5+ :synopsis: Extensible library for opening URLs.
66.. moduleauthor:: Jeremy Hylton <[email protected]>
77.. sectionauthor:: Moshe Zadka <[email protected]>
8+ .. sectionauthor:: Senthil Kumaran <[email protected]>
89
910
1011The :mod:`urllib.request` module defines functions and classes which help in
@@ -20,16 +21,26 @@ The :mod:`urllib.request` module defines the following functions:
2021 Open the URL *url*, which can be either a string or a
2122 :class:`Request` object.
2223
23- *data* may be a bytes object specifying additional data to send to the
24+ *data* must be a bytes object specifying additional data to be sent to the
2425 server, or ``None`` if no such data is needed. *data* may also be an
2526 iterable object and in that case Content-Length value must be specified in
2627 the headers. Currently HTTP requests are the only ones that use *data*; the
2728 HTTP request will be a POST instead of a GET when the *data* parameter is
28- provided. *data* should be a buffer in the standard
29+ provided.
30+
31+ *data* should be a buffer in the standard
2932 :mimetype:`application/x-www-form-urlencoded` format. The
3033 :func:`urllib.parse.urlencode` function takes a mapping or sequence of
31- 2-tuples and returns a string in this format. urllib.request module uses
32- HTTP/1.1 and includes ``Connection:close`` header in its HTTP requests.
34+ 2-tuples and returns a string in this format. It should be encoded to bytes
35+ before being used as the *data* parameter. The charset parameter in
36+ ``Content-Type`` header may be used to specify the encoding. If charset
37+ parameter is not sent with the Content-Type header, the server following the
38+ HTTP 1.1 recommendation may assume that the data is encoded in ISO-8859-1
39+ encoding. It is advisable to use charset parameter with encoding used in
40+ ``Content-Type`` header with the :class:`Request`.
41+
42+ urllib.request module uses HTTP/1.1 and includes ``Connection:close`` header
43+ in its HTTP requests.
3344
3445 The optional *timeout* parameter specifies a timeout in seconds for
3546 blocking operations like the connection attempt (if not specified,
@@ -66,9 +77,10 @@ The :mod:`urllib.request` module defines the following functions:
6677 are handled through the proxy when they are set.
6778
6879 The legacy ``urllib.urlopen`` function from Python 2.6 and earlier has been
69- discontinued; :func:`urlopen` corresponds to the old ``urllib2.urlopen``.
70- Proxy handling, which was done by passing a dictionary parameter to
71- ``urllib.urlopen``, can be obtained by using :class:`ProxyHandler` objects.
80+ discontinued; :func:`urllib.request.urlopen` corresponds to the old
81+ ``urllib2.urlopen``. Proxy handling, which was done by passing a dictionary
82+ parameter to ``urllib.urlopen``, can be obtained by using
83+ :class:`ProxyHandler` objects.
7284
7385 .. versionchanged:: 3.2
7486 *cafile* and *capath* were added.
@@ -83,10 +95,11 @@ The :mod:`urllib.request` module defines the following functions:
8395.. function:: install_opener(opener)
8496
8597 Install an :class:`OpenerDirector` instance as the default global opener.
86- Installing an opener is only necessary if you want urlopen to use that opener;
87- otherwise, simply call :meth:`OpenerDirector.open` instead of :func:`urlopen`.
88- The code does not check for a real :class:`OpenerDirector`, and any class with
89- the appropriate interface will work.
98+ Installing an opener is only necessary if you want urlopen to use that
99+ opener; otherwise, simply call :meth:`OpenerDirector.open` instead of
100+ :func:`~urllib.request.urlopen`. The code does not check for a real
101+ :class:`OpenerDirector`, and any class with the appropriate interface will
102+ work.
90103
91104
92105.. function:: build_opener([handler, ...])
@@ -138,13 +151,21 @@ The following classes are provided:
138151
139152 *url* should be a string containing a valid URL.
140153
141- *data* may be a bytes object specifying additional data to send to the
154+ *data* must be a bytes object specifying additional data to send to the
142155 server, or ``None`` if no such data is needed. Currently HTTP requests are
143156 the only ones that use *data*; the HTTP request will be a POST instead of a
144157 GET when the *data* parameter is provided. *data* should be a buffer in the
145- standard :mimetype:`application/x-www-form-urlencoded` format. The
146- :func:`urllib.parse.urlencode` function takes a mapping or sequence of
147- 2-tuples and returns a string in this format.
158+ standard :mimetype:`application/x-www-form-urlencoded` format.
159+
160+ The :func:`urllib.parse.urlencode` function takes a mapping or sequence of
161+ 2-tuples and returns a string in this format. It should be encoded to bytes
162+ before being used as the *data* parameter. The charset parameter in
163+ ``Content-Type`` header may be used to specify the encoding. If charset
164+ parameter is not sent with the Content-Type header, the server following the
165+ HTTP 1.1 recommendation may assume that the data is encoded in ISO-8859-1
166+ encoding. It is advisable to use charset parameter with encoding used in
167+ ``Content-Type`` header with the :class:`Request`.
168+
148169
149170 *headers* should be a dictionary, and will be treated as if
150171 :meth:`add_header` was called with each key and value as arguments.
@@ -156,8 +177,11 @@ The following classes are provided:
156177 :mod:`urllib`'s default user agent string is
157178 ``"Python-urllib/2.6"`` (on Python 2.6).
158179
159- The following two arguments, *origin_req_host* and *unverifiable*,
160- are only of interest for correct handling of third-party HTTP cookies:
180+ An example of using ``Content-Type`` header with *data* argument would be
181+ sending a dictionary like ``{"Content-Type": "application/x-www-form-urlencoded;charset=utf-8"}``.
182+
183+ The final two arguments are only of interest for correct handling
184+ of third-party HTTP cookies:
161185
162186 *origin_req_host* should be the request-host of the origin
163187 transaction, as defined by :rfc:`2965`. It defaults to
@@ -1107,8 +1131,9 @@ every :class:`Request`. To change this::
11071131 opener.open('http://www.example.com/')
11081132
11091133Also, remember that a few standard headers (:mailheader:`Content-Length`,
1110- :mailheader:`Content-Type` and :mailheader:`Host`) are added when the
1111- :class:`Request` is passed to :func:`urlopen` (or :meth:`OpenerDirector.open`).
1134+ :mailheader:`Content-Type` without charset parameter and :mailheader:`Host`)
1135+ are added when the :class:`Request` is passed to :func:`urlopen` (or
1136+ :meth:`OpenerDirector.open`).
11121137
11131138.. _urllib-examples:
11141139
@@ -1126,9 +1151,12 @@ from urlencode is encoded to bytes before it is sent to urlopen as data::
11261151
11271152 >>> import urllib.request
11281153 >>> import urllib.parse
1129- >>> params = urllib.parse.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
1130- >>> params = params.encode('utf-8')
1131- >>> f = urllib.request.urlopen("http://www.musi-cal.com/cgi-bin/query", params)
1154+ >>> data = urllib.parse.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
1155+ >>> data = data.encode('utf-8')
1156+ >>> request = urllib.request.Request("http://requestb.in/xrbl82xr")
1157+ >>> # adding charset parameter to the Content-Type header.
1158+ >>> request.add_header("Content-Type","application/x-www-form-urlencoded;charset=utf-8")
1159+ >>> f = urllib.request.urlopen(request, data)
11321160 >>> print(f.read().decode('utf-8'))
11331161
11341162The following example uses an explicitly specified HTTP proxy, overriding
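The workflow this change documents (urlencode the fields, encode the result to bytes, and state the charset in ``Content-Type``) can be checked without any network access. The following is a minimal sketch rather than part of the patch; the URL ``http://www.example.com/form`` and the field values are placeholders chosen for illustration. It builds the :class:`Request` only and inspects it, so nothing is sent.

```python
import urllib.parse
import urllib.request

# Hypothetical endpoint and form fields, used only for illustration.
url = "http://www.example.com/form"
fields = {"spam": 1, "eggs": 2, "bacon": 0}

# urlencode() returns a str; it must be encoded to bytes before it is
# passed as *data*.
body = urllib.parse.urlencode(fields).encode("utf-8")

request = urllib.request.Request(url, data=body)

# Name the encoding explicitly so the server does not have to guess it.
request.add_header(
    "Content-Type", "application/x-www-form-urlencoded;charset=utf-8"
)

# Supplying *data* makes the request a POST instead of a GET.
method = request.get_method()

# Request stores header names capitalized ("Content-type"), so look the
# header up under that spelling.
content_type = request.get_header("Content-type")

print(method, content_type)
```

Passing ``request`` to :func:`urllib.request.urlopen` would then send the body with the explicit charset, as in the updated example above.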