Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Commit 394e55a

Browse files
tiranmiss-islington
authored andcommitted
[3.7] bpo-17239: Disable external entities in SAX parser (GH-9217) (GH-9511)
The SAX parser no longer processes general external entities by default to increase security. Before, the parser created network connections to fetch remote files or loaded local files from the file system for DTD and entities. Signed-off-by: Christian Heimes <[email protected]> https://bugs.python.org/issue17239. (cherry picked from commit 17b1d5d) Co-authored-by: Christian Heimes <[email protected]> https://bugs.python.org/issue17239
1 parent ddc163d commit 394e55a

File tree

9 files changed

+120
-5
lines changed

9 files changed

+120
-5
lines changed

Doc/library/xml.dom.pulldom.rst

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -25,6 +25,20 @@ events until either processing is finished or an error condition occurs.
2525
maliciously constructed data. If you need to parse untrusted or
2626
unauthenticated data see :ref:`xml-vulnerabilities`.
2727

28+
.. versionchanged:: 3.7.1
29+
30+
The SAX parser no longer processes general external entities by default to
31+
increase security by default. To enable processing of external entities,
32+
pass a custom parser instance in::
33+
34+
from xml.dom.pulldom import parse
35+
from xml.sax import make_parser
36+
from xml.sax.handler import feature_external_ges
37+
38+
parser = make_parser()
39+
parser.setFeature(feature_external_ges, True)
40+
parse(filename, parser=parser)
41+
2842

2943
Example::
3044

Doc/library/xml.rst

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -65,8 +65,8 @@ kind sax etree minidom p
6565
========================= ============== =============== ============== ============== ==============
6666
billion laughs **Vulnerable** **Vulnerable** **Vulnerable** **Vulnerable** **Vulnerable**
6767
quadratic blowup **Vulnerable** **Vulnerable** **Vulnerable** **Vulnerable** **Vulnerable**
68-
external entity expansion **Vulnerable** Safe (1) Safe (2) **Vulnerable** Safe (3)
69-
`DTD`_ retrieval **Vulnerable** Safe Safe **Vulnerable** Safe
68+
external entity expansion Safe (4) Safe (1) Safe (2) Safe (4) Safe (3)
69+
`DTD`_ retrieval Safe (4) Safe Safe Safe (4) Safe
7070
decompression bomb Safe Safe Safe Safe **Vulnerable**
7171
========================= ============== =============== ============== ============== ==============
7272

@@ -75,6 +75,8 @@ decompression bomb Safe Safe Safe S
7575
2. :mod:`xml.dom.minidom` doesn't expand external entities and simply returns
7676
the unexpanded entity verbatim.
7777
3. :mod:`xmlrpclib` doesn't expand external entities and omits them.
78+
4. Since Python 3.8.0, external general entities are no longer processed by
79+
default since Python.
7880

7981

8082
billion laughs / exponential entity expansion

Doc/library/xml.sax.rst

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -24,6 +24,14 @@ the SAX API.
2424
constructed data. If you need to parse untrusted or unauthenticated data see
2525
:ref:`xml-vulnerabilities`.
2626

27+
.. versionchanged:: 3.7.1
28+
29+
The SAX parser no longer processes general external entities by default
30+
to increase security. Before, the parser created network connections
31+
to fetch remote files or loaded local files from the file
32+
system for DTD and entities. The feature can be enabled again with method
33+
:meth:`~xml.sax.xmlreader.XMLReader.setFeature` on the parser object
34+
and argument :data:`~xml.sax.handler.feature_external_ges`.
2735

2836
The convenience functions are:
2937

Doc/whatsnew/3.7.rst

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1573,6 +1573,15 @@ at the interactive prompt. See :ref:`whatsnew37-pep565` for details.
15731573
(Contributed by Nick Coghlan in :issue:`31975`.)
15741574

15751575

1576+
xml
1577+
---
1578+
1579+
As mitigation against DTD and external entity retrieval, the
1580+
:mod:`xml.dom.minidom` and mod:`xml.sax` modules no longer process
1581+
external entities by default.
1582+
(Contributed by Christian Heimes in :issue:`17239`.)
1583+
1584+
15761585
xml.etree
15771586
---------
15781587

@@ -2502,3 +2511,6 @@ calling :c:func:`Py_Initialize`.
25022511
In 3.7.1 the C API for Context Variables
25032512
:ref:`was updated <contextvarsobjects_pointertype_change>` to use
25042513
:c:type:`PyObject` pointers. See also :issue:`34762`.
2514+
2515+
:mod:`xml.dom.minidom` and mod:`xml.sax` modules no longer process
2516+
external entities by default. See also :issue:`17239`.

Lib/test/test_pulldom.py

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3,6 +3,7 @@
33
import xml.sax
44

55
from xml.sax.xmlreader import AttributesImpl
6+
from xml.sax.handler import feature_external_ges
67
from xml.dom import pulldom
78

89
from test.support import findfile
@@ -159,6 +160,12 @@ def test_end_document(self):
159160
self.fail(
160161
"Ran out of events, but should have received END_DOCUMENT")
161162

163+
def test_external_ges_default(self):
164+
parser = pulldom.parseString(SMALL_SAMPLE)
165+
saxparser = parser.parser
166+
ges = saxparser.getFeature(feature_external_ges)
167+
self.assertEqual(ges, False)
168+
162169

163170
class ThoroughTestCase(unittest.TestCase):
164171
"""Test the hard-to-reach parts of pulldom."""

Lib/test/test_sax.py

Lines changed: 58 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -13,13 +13,14 @@
1313
from xml.sax.saxutils import XMLGenerator, escape, unescape, quoteattr, \
1414
XMLFilterBase, prepare_input_source
1515
from xml.sax.expatreader import create_parser
16-
from xml.sax.handler import feature_namespaces
16+
from xml.sax.handler import feature_namespaces, feature_external_ges
1717
from xml.sax.xmlreader import InputSource, AttributesImpl, AttributesNSImpl
1818
from io import BytesIO, StringIO
1919
import codecs
2020
import gc
2121
import os.path
2222
import shutil
23+
from urllib.error import URLError
2324
from test import support
2425
from test.support import findfile, run_unittest, TESTFN
2526

@@ -911,6 +912,18 @@ def notationDecl(self, name, publicId, systemId):
911912
def unparsedEntityDecl(self, name, publicId, systemId, ndata):
912913
self._entities.append((name, publicId, systemId, ndata))
913914

915+
916+
class TestEntityRecorder:
917+
def __init__(self):
918+
self.entities = []
919+
920+
def resolveEntity(self, publicId, systemId):
921+
self.entities.append((publicId, systemId))
922+
source = InputSource()
923+
source.setPublicId(publicId)
924+
source.setSystemId(systemId)
925+
return source
926+
914927
def test_expat_dtdhandler(self):
915928
parser = create_parser()
916929
handler = self.TestDTDHandler()
@@ -927,6 +940,32 @@ def test_expat_dtdhandler(self):
927940
[("GIF", "-//CompuServe//NOTATION Graphics Interchange Format 89a//EN", None)])
928941
self.assertEqual(handler._entities, [("img", None, "expat.gif", "GIF")])
929942

943+
def test_expat_external_dtd_enabled(self):
944+
parser = create_parser()
945+
parser.setFeature(feature_external_ges, True)
946+
resolver = self.TestEntityRecorder()
947+
parser.setEntityResolver(resolver)
948+
949+
with self.assertRaises(URLError):
950+
parser.feed(
951+
'<!DOCTYPE external SYSTEM "unsupported://non-existing">\n'
952+
)
953+
self.assertEqual(
954+
resolver.entities, [(None, 'unsupported://non-existing')]
955+
)
956+
957+
def test_expat_external_dtd_default(self):
958+
parser = create_parser()
959+
resolver = self.TestEntityRecorder()
960+
parser.setEntityResolver(resolver)
961+
962+
parser.feed(
963+
'<!DOCTYPE external SYSTEM "unsupported://non-existing">\n'
964+
)
965+
parser.feed('<doc />')
966+
parser.close()
967+
self.assertEqual(resolver.entities, [])
968+
930969
# ===== EntityResolver support
931970

932971
class TestEntityResolver:
@@ -936,8 +975,9 @@ def resolveEntity(self, publicId, systemId):
936975
inpsrc.setByteStream(BytesIO(b"<entity/>"))
937976
return inpsrc
938977

939-
def test_expat_entityresolver(self):
978+
def test_expat_entityresolver_enabled(self):
940979
parser = create_parser()
980+
parser.setFeature(feature_external_ges, True)
941981
parser.setEntityResolver(self.TestEntityResolver())
942982
result = BytesIO()
943983
parser.setContentHandler(XMLGenerator(result))
@@ -951,6 +991,22 @@ def test_expat_entityresolver(self):
951991
self.assertEqual(result.getvalue(), start +
952992
b"<doc><entity></entity></doc>")
953993

994+
def test_expat_entityresolver_default(self):
995+
parser = create_parser()
996+
self.assertEqual(parser.getFeature(feature_external_ges), False)
997+
parser.setEntityResolver(self.TestEntityResolver())
998+
result = BytesIO()
999+
parser.setContentHandler(XMLGenerator(result))
1000+
1001+
parser.feed('<!DOCTYPE doc [\n')
1002+
parser.feed(' <!ENTITY test SYSTEM "whatever">\n')
1003+
parser.feed(']>\n')
1004+
parser.feed('<doc>&test;</doc>')
1005+
parser.close()
1006+
1007+
self.assertEqual(result.getvalue(), start +
1008+
b"<doc></doc>")
1009+
9541010
# ===== Attributes support
9551011

9561012
class AttrGatherer(ContentHandler):

Lib/test/test_xml_etree.py

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -91,6 +91,12 @@
9191
<document>&entity;</document>
9292
"""
9393

94+
EXTERNAL_ENTITY_XML = """\
95+
<!DOCTYPE points [
96+
<!ENTITY entity SYSTEM "file:///non-existing-file.xml">
97+
]>
98+
<document>&entity;</document>
99+
"""
94100

95101
def checkwarnings(*filters, quiet=False):
96102
def decorator(test):
@@ -861,6 +867,13 @@ def test_entity(self):
861867
root = parser.close()
862868
self.serialize_check(root, '<document>text</document>')
863869

870+
# 4) external (SYSTEM) entity
871+
872+
with self.assertRaises(ET.ParseError) as cm:
873+
ET.XML(EXTERNAL_ENTITY_XML)
874+
self.assertEqual(str(cm.exception),
875+
'undefined entity &entity;: line 4, column 10')
876+
864877
def test_namespace(self):
865878
# Test namespace issues.
866879

Lib/xml/sax/expatreader.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -95,7 +95,7 @@ def __init__(self, namespaceHandling=0, bufsize=2**16-20):
9595
self._lex_handler_prop = None
9696
self._parsing = 0
9797
self._entity_stack = []
98-
self._external_ges = 1
98+
self._external_ges = 0
9999
self._interning = None
100100

101101
# XMLReader methods
Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
The xml.sax and xml.dom.minidom parsers no longer processes external
2+
entities by default. External DTD and ENTITY declarations no longer
3+
load files or create network connections.

0 commit comments

Comments
 (0)