Commit 372ac5e

Author: Victor Stinner
PyObject_Dump() encodes unicode objects to utf8 with backslashreplace (instead
of strict) error handler to escape surrogates
1 parent 6baded4 commit 372ac5e
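
As a rough Python-level illustration of the two error handlers named in the commit message (a sketch only: the commit itself changes C code, and the helper names `encode_strict` and `encode_escaped` are hypothetical), a lone surrogate such as U+DCFF cannot be encoded to UTF-8 with `strict`, while `backslashreplace` turns it into an escape sequence:

```python
# A string containing a lone surrogate, as used in the commit's test.
text = "surrogates:\udcff"


def encode_strict(s):
    """Encode with the strict handler; return None if encoding fails."""
    try:
        return s.encode("utf-8", "strict")
    except UnicodeEncodeError:
        # Lone surrogates are not valid in UTF-8, so strict raises here.
        return None


def encode_escaped(s):
    """Encode with backslashreplace; unencodable characters become escapes."""
    return s.encode("utf-8", "backslashreplace")


print(encode_strict(text))   # None: the strict handler rejects the surrogate
print(encode_escaped(text))  # b'surrogates:\\udcff'
```

The escaped form loses nothing that matters for debugging output: the code point is still visible as `\udcff` instead of the encode step failing.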

3 files changed

Lines changed: 16 additions & 1 deletion

Lib/test/test_sys.py

Lines changed: 10 additions & 0 deletions
@@ -145,6 +145,16 @@ def test_exit(self):
                                "raise SystemExit(47)"])
         self.assertEqual(rc, 47)
 
+        # test that the exit message is written with backslashreplace error
+        # handler to stderr
+        import subprocess
+        code = r'import sys; sys.exit("surrogates:\uDCFF")'
+        process = subprocess.Popen([sys.executable, "-c", code],
+                                   stderr=subprocess.PIPE)
+        stdout, stderr = process.communicate()
+        self.assertEqual(process.returncode, 1)
+        self.assertTrue(stderr.startswith(b"surrogates:\\udcff"), stderr)
+
     def test_getdefaultencoding(self):
         self.assertRaises(TypeError, sys.getdefaultencoding, 42)
         # can't check more than the type, as the user might have changed it
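
The added test can also be tried outside the test suite. The following is a minimal standalone sketch, assuming a Python 3 interpreter is available as `sys.executable`; the function name `run_exit_with_surrogate` is hypothetical, and `subprocess.run` is used here in place of the `Popen`/`communicate` pair from the diff:

```python
import subprocess
import sys


def run_exit_with_surrogate():
    """Run a child interpreter that exits with a message containing a lone surrogate."""
    # Raw string: the child process, not this one, interprets the \uDCFF escape.
    code = r'import sys; sys.exit("surrogates:\uDCFF")'
    proc = subprocess.run([sys.executable, "-c", code],
                          stderr=subprocess.PIPE)
    return proc.returncode, proc.stderr


returncode, stderr = run_exit_with_surrogate()
print(returncode)
print(stderr)
```

Calling `sys.exit()` with a string writes the message to stderr and exits with status 1, which is why the test checks both the return code and the escaped prefix of the captured output.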

Misc/NEWS

Lines changed: 3 additions & 0 deletions
@@ -12,6 +12,9 @@ What's New in Python 3.2 Alpha 1?
 Core and Builtins
 -----------------
 
+- PyObject_Dump() encodes unicode objects to utf8 with backslashreplace
+  (instead of strict) error handler to escape surrogates
+
 - Issue #8715: Create PyUnicode_EncodeFSDefault() function: Encode a Unicode
   object to Py_FileSystemDefaultEncoding with the "surrogateescape" error
   handler, and return bytes. If Py_FileSystemDefaultEncoding is not set, fall

Objects/object.c

Lines changed: 3 additions & 1 deletion
@@ -303,7 +303,9 @@ internal_print(PyObject *op, FILE *fp, int flags, int nesting)
             }
             else if (PyUnicode_Check(s)) {
                 PyObject *t;
-                t = _PyUnicode_AsDefaultEncodedString(s, NULL);
+                t = PyUnicode_EncodeUTF8(PyUnicode_AS_UNICODE(s),
+                                         PyUnicode_GET_SIZE(s),
+                                         "backslashreplace");
                 if (t == NULL)
                     ret = 0;
                 else {
