Commit 8e08ac9

barneygale, picnixz, AA-Turner, and zooba authored
GH-123599: url2pathname(): don't call gethostbyname() by default (#132610)
Follow-up to 66cdb2b. Add *resolve_host* keyword-only argument to `url2pathname()`, defaulting to false. When set to true, we call `socket.gethostbyname()` to resolve the URL hostname.

Co-authored-by: Bénédikt Tran <[email protected]>
Co-authored-by: Adam Turner <[email protected]>
Co-authored-by: Steve Dower <[email protected]>
1 parent 082dbf7 commit 8e08ac9

7 files changed: +48 -30 lines changed

Doc/library/pathlib.rst

Lines changed: 4 additions & 4 deletions
@@ -872,10 +872,10 @@ conforming to :rfc:`8089`.
    .. versionadded:: 3.13

    .. versionchanged:: next
-      If a URL authority (e.g. a hostname) is present and resolves to a local
-      address, it is discarded. If an authority is present and *doesn't*
-      resolve to a local address, then on Windows a UNC path is returned (as
-      before), and on other platforms a :exc:`ValueError` is raised.
+      The URL authority is discarded if it matches the local hostname.
+      Otherwise, if the authority isn't empty or ``localhost``, then on
+      Windows a UNC path is returned (as before), and on other platforms a
+      :exc:`ValueError` is raised.


 .. method:: Path.as_uri()

Doc/library/urllib.request.rst

Lines changed: 12 additions & 6 deletions
@@ -172,10 +172,10 @@ The :mod:`urllib.request` module defines the following functions:
       the URL ``///etc/hosts``.

    .. versionchanged:: next
-      The *add_scheme* argument was added.
+      The *add_scheme* parameter was added.


-.. function:: url2pathname(url, *, require_scheme=False)
+.. function:: url2pathname(url, *, require_scheme=False, resolve_host=False)

    Convert the given ``file:`` URL to a local path. This function uses
    :func:`~urllib.parse.unquote` to decode the URL.
@@ -185,6 +185,13 @@ The :mod:`urllib.request` module defines the following functions:
    value should include the prefix; a :exc:`~urllib.error.URLError` is raised
    if it doesn't.

+   The URL authority is discarded if it is empty, ``localhost``, or the local
+   hostname. Otherwise, if *resolve_host* is set to true, the authority is
+   resolved using :func:`socket.gethostbyname` and discarded if it matches a
+   local IP address (as per :rfc:`RFC 8089 §3 <8089#section-3>`). If the
+   authority is still unhandled, then on Windows a UNC path is returned, and
+   on other platforms a :exc:`~urllib.error.URLError` is raised.
+
    This example shows the function being used on Windows::

       >>> from urllib.request import url2pathname
@@ -198,14 +205,13 @@ The :mod:`urllib.request` module defines the following functions:
       :exc:`OSError` exception to be raised on Windows.

    .. versionchanged:: next
-      This function calls :func:`socket.gethostbyname` if the URL authority
-      isn't empty, ``localhost``, or the machine hostname. If the authority
-      resolves to a local IP address then it is discarded; otherwise, on
+      The URL authority is discarded if it matches the local hostname.
+      Otherwise, if the authority isn't empty or ``localhost``, then on
       Windows a UNC path is returned (as before), and on other platforms a
       :exc:`~urllib.error.URLError` is raised.

    .. versionchanged:: next
-      The *require_scheme* argument was added.
+      The *require_scheme* and *resolve_host* parameters were added.


 .. function:: getproxies()
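
The order of checks described in the new documentation paragraph can be sketched as a standalone function. This is only an illustration under stated assumptions: `is_local_authority` and `local_addresses` are hypothetical names, and the set of local addresses is simplified to the loopback address plus whatever the machine hostname resolves to, whereas the commit itself compares against `FileHandler().get_names()`.

```python
import socket


def local_addresses():
    """Best-effort set of local IPv4 addresses (simplified assumption)."""
    addresses = {'127.0.0.1'}
    try:
        addresses.add(socket.gethostbyname(socket.gethostname()))
    except (OSError, UnicodeError):
        pass  # hostname could not be resolved; keep loopback only
    return addresses


def is_local_authority(authority, resolve=False):
    """Hypothetical stand-in for the check performed on a URL authority."""
    # 1. Empty authority or 'localhost': always local, no lookup needed.
    if not authority or authority == 'localhost':
        return True
    # 2. Plain string comparison with the machine hostname (no DNS).
    try:
        if authority == socket.gethostname():
            return True
    except OSError:
        pass
    # 3. DNS resolution happens only when explicitly requested.
    if not resolve:
        return False
    try:
        address = socket.gethostbyname(authority)
    except (OSError, UnicodeError):
        return False
    return address in local_addresses()
```

With `resolve=False` the function never reaches `socket.gethostbyname()` for the authority, which is the point of the new default.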

Doc/whatsnew/3.14.rst

Lines changed: 5 additions & 3 deletions
@@ -1703,9 +1703,11 @@ urllib

   - Accept a complete URL when the new *require_scheme* argument is set to
     true.
-  - Discard URL authorities that resolve to a local IP address.
-  - Raise :exc:`~urllib.error.URLError` if a URL authority doesn't resolve
-    to a local IP address, except on Windows where we return a UNC path.
+  - Discard URL authority if it matches the local hostname.
+  - Discard URL authority if it resolves to a local IP address when the new
+    *resolve_host* argument is set to true.
+  - Raise :exc:`~urllib.error.URLError` if a URL authority isn't local,
+    except on Windows where we return a UNC path as before.

   In :func:`urllib.request.pathname2url`:

Lib/test/test_pathlib/test_pathlib.py

Lines changed: 0 additions & 1 deletion
@@ -3290,7 +3290,6 @@ def test_from_uri_posix(self):
         self.assertEqual(P.from_uri('file:////foo/bar'), P('//foo/bar'))
         self.assertEqual(P.from_uri('file://localhost/foo/bar'), P('/foo/bar'))
         if not is_wasi:
-            self.assertEqual(P.from_uri('file://127.0.0.1/foo/bar'), P('/foo/bar'))
             self.assertEqual(P.from_uri(f'file://{socket.gethostname()}/foo/bar'),
                              P('/foo/bar'))
         self.assertRaises(ValueError, P.from_uri, 'foo/bar')

Lib/test/test_urllib.py

Lines changed: 12 additions & 4 deletions
@@ -1551,7 +1551,8 @@ def test_url2pathname_require_scheme(self):
                     urllib.request.url2pathname(url, require_scheme=True),
                     expected_path)

-        error_subtests = [
+    def test_url2pathname_require_scheme_errors(self):
+        subtests = [
             '',
             ':',
             'foo',
@@ -1561,13 +1562,20 @@ def test_url2pathname_require_scheme(self):
             'data:file:foo',
             'data:file://foo',
         ]
-        for url in error_subtests:
+        for url in subtests:
             with self.subTest(url=url):
                 self.assertRaises(
                     urllib.error.URLError,
                     urllib.request.url2pathname,
                     url, require_scheme=True)

+    def test_url2pathname_resolve_host(self):
+        fn = urllib.request.url2pathname
+        sep = os.path.sep
+        self.assertEqual(fn('//127.0.0.1/foo/bar', resolve_host=True), f'{sep}foo{sep}bar')
+        self.assertEqual(fn(f'//{socket.gethostname()}/foo/bar'), f'{sep}foo{sep}bar')
+        self.assertEqual(fn(f'//{socket.gethostname()}/foo/bar', resolve_host=True), f'{sep}foo{sep}bar')
+
     @unittest.skipUnless(sys.platform == 'win32',
                          'test specific to Windows pathnames.')
     def test_url2pathname_win(self):
@@ -1598,6 +1606,7 @@ def test_url2pathname_win(self):
         self.assertEqual(fn('//server/path/to/file'), '\\\\server\\path\\to\\file')
         self.assertEqual(fn('////server/path/to/file'), '\\\\server\\path\\to\\file')
         self.assertEqual(fn('/////server/path/to/file'), '\\\\server\\path\\to\\file')
+        self.assertEqual(fn('//127.0.0.1/path/to/file'), '\\\\127.0.0.1\\path\\to\\file')
         # Localhost paths
         self.assertEqual(fn('//localhost/C:/path/to/file'), 'C:\\path\\to\\file')
         self.assertEqual(fn('//localhost/C|/path/to/file'), 'C:\\path\\to\\file')
@@ -1622,8 +1631,7 @@ def test_url2pathname_posix(self):
         self.assertRaises(urllib.error.URLError, fn, '//:80/foo/bar')
         self.assertRaises(urllib.error.URLError, fn, '//:/foo/bar')
         self.assertRaises(urllib.error.URLError, fn, '//c:80/foo/bar')
-        self.assertEqual(fn('//127.0.0.1/foo/bar'), '/foo/bar')
-        self.assertEqual(fn(f'//{socket.gethostname()}/foo/bar'), '/foo/bar')
+        self.assertRaises(urllib.error.URLError, fn, '//127.0.0.1/foo/bar')

     @unittest.skipUnless(os_helper.FS_NONASCII, 'need os_helper.FS_NONASCII')
     def test_url2pathname_nonascii(self):
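
The tests above check return values; a complementary check, sketched here, is to count how often `url2pathname()` reaches `socket.gethostbyname()`. The helper name `dns_lookups` and the `example.invalid` host are illustrative assumptions, and the `resolve_host` keyword is only passed when the running interpreter's `url2pathname()` accepts it.

```python
import inspect
import socket
from unittest import mock
from urllib.request import url2pathname

# True only on interpreters whose url2pathname() has the new parameter.
SUPPORTS_RESOLVE_HOST = (
    'resolve_host' in inspect.signature(url2pathname).parameters)


def dns_lookups(url, **kwargs):
    """Return how many times url2pathname() called socket.gethostbyname()."""
    # Replace the resolver with a stub that always fails, so no real
    # network lookup can happen while we count the calls.
    with mock.patch.object(socket, 'gethostbyname',
                           side_effect=socket.gaierror) as fake:
        try:
            url2pathname(url, **kwargs)
        except OSError:
            pass  # URLError (an OSError subclass) for non-local hosts
    return fake.call_count


if __name__ == '__main__':
    url = '//example.invalid/foo/bar'
    print('default:', dns_lookups(url))
    if SUPPORTS_RESOLVE_HOST:
        print('resolve_host=True:', dns_lookups(url, resolve_host=True))
```

Patching the attribute on the `socket` module works here because `urllib.request` looks the function up through the module at call time.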

Lib/urllib/request.py

Lines changed: 12 additions & 7 deletions
@@ -1466,7 +1466,7 @@ def get_names(self):
     def open_local_file(self, req):
         import email.utils
         import mimetypes
-        localfile = url2pathname(req.full_url, require_scheme=True)
+        localfile = url2pathname(req.full_url, require_scheme=True, resolve_host=True)
         try:
             stats = os.stat(localfile)
             size = stats.st_size
@@ -1482,7 +1482,7 @@ def open_local_file(self, req):

     file_open = open_local_file

-def _is_local_authority(authority):
+def _is_local_authority(authority, resolve):
     # Compare hostnames
     if not authority or authority == 'localhost':
         return True
@@ -1494,9 +1494,11 @@ def _is_local_authority(authority):
     if authority == hostname:
         return True
     # Compare IP addresses
+    if not resolve:
+        return False
     try:
         address = socket.gethostbyname(authority)
-    except (socket.gaierror, AttributeError):
+    except (socket.gaierror, AttributeError, UnicodeEncodeError):
         return False
     return address in FileHandler().get_names()

@@ -1641,21 +1643,24 @@ def data_open(self, req):
         return addinfourl(io.BytesIO(data), headers, url)


-# Code move from the old urllib module
+# Code moved from the old urllib module

-def url2pathname(url, *, require_scheme=False):
+def url2pathname(url, *, require_scheme=False, resolve_host=False):
     """Convert the given file URL to a local file system path.

     The 'file:' scheme prefix must be omitted unless *require_scheme*
     is set to true.
+
+    The URL authority may be resolved with gethostbyname() if
+    *resolve_host* is set to true.
     """
     if require_scheme:
         scheme, url = _splittype(url)
         if scheme != 'file':
             raise URLError("URL is missing a 'file:' scheme")
     authority, url = _splithost(url)
     if os.name == 'nt':
-        if not _is_local_authority(authority):
+        if not _is_local_authority(authority, resolve_host):
             # e.g. file://server/share/file.txt
             url = '//' + authority + url
         elif url[:3] == '///':
@@ -1669,7 +1674,7 @@ def url2pathname(url, *, require_scheme=False):
                 # Older URLs use a pipe after a drive letter
                 url = url[:1] + ':' + url[2:]
         url = url.replace('/', '\\')
-    elif not _is_local_authority(authority):
+    elif not _is_local_authority(authority, resolve_host):
         raise URLError("file:// scheme is supported only on localhost")
     encoding = sys.getfilesystemencoding()
     errors = sys.getfilesystemencodeerrors()
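
The platform split in `url2pathname()` above can be reduced to a small, platform-independent sketch. `apply_authority` is a hypothetical helper, not part of urllib: it takes the outcome of the locality check and the target platform as explicit arguments, and it leaves out the drive-letter and extra-slash handling that the real Windows branch performs.

```python
from urllib.error import URLError


def apply_authority(authority, path, *, is_local, windows):
    """Hypothetical helper: combine a URL authority with its path."""
    if is_local:
        # Local authority: discard it and keep only the path.
        result = path
    elif windows:
        # Non-local authority on Windows: build a UNC-style path,
        # e.g. file://server/share/file.txt
        result = '//' + authority + path
    else:
        # Non-local authority elsewhere: it cannot be expressed as a path.
        raise URLError("file:// scheme is supported only on localhost")
    # Windows paths use backslashes; other platforms keep forward slashes.
    return result.replace('/', '\\') if windows else result
```

Passing `is_local` and `windows` as arguments keeps the sketch testable on any machine, at the cost of not reflecting how the real function consults `os.name` and the local hostname itself.
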
Lines changed: 3 additions & 5 deletions
@@ -1,5 +1,3 @@
-Fix issue where :func:`urllib.request.url2pathname` mishandled file URLs with
-authorities. If an authority is present and resolves to ``localhost``, it is
-now discarded. If an authority is present but *doesn't* resolve to
-``localhost``, then on Windows a UNC path is returned (as before), and on
-other platforms a :exc:`urllib.error.URLError` is now raised.
+Add *resolve_host* keyword-only parameter to
+:func:`urllib.request.url2pathname`, and fix handling of file URLs with
+authorities.
