bpo-37751: Fix normalizestring() with hyphens and spaces converted to underscores #15092

Merged 9 commits on Aug 21, 2019
@@ -0,0 +1 @@
Fix :func:`codecs.lookup` to normalize the encoding name the same way as :func:`encodings.normalize_encoding`, except that :func:`codecs.lookup` also converts the name to lower case.
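
To make the behaviour described in this entry concrete, the short sketch below compares the two functions it names. It uses only the standard `codecs` and `encodings` modules. The sample spellings are arbitrary illustrations, and the exact string handed to a registered search function depends on the Python version, so the sketch only shows results that do not depend on it.

```python
import codecs
import encodings

# Arbitrary sample spellings of the same encoding (illustrative only).
samples = ["UTF-8", "utf 8", "Utf_8"]

# encodings.normalize_encoding() turns runs of punctuation into a single
# underscore but keeps the original letter case.
normalized = [encodings.normalize_encoding(name) for name in samples]

# codecs.lookup() accepts every spelling and resolves them to one codec.
resolved = {codecs.lookup(name).name for name in samples}

print(normalized)
print(resolved)
```

The only difference the entry calls out is case: `codecs.lookup` additionally lower-cases the name before the search functions see it.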
32 changes: 16 additions & 16 deletions Python/codecs.c
@@ -49,36 +49,36 @@ int PyCodec_Register(PyObject *search_function)
     return -1;
 }

-/* Convert a string to a normalized Python string: all characters are
-   converted to lower case, spaces are replaced with underscores. */
+extern int _Py_normalize_encoding(const char *, char *, size_t);
+
+/* Convert a string to a normalized Python string(decoded from UTF-8): all characters are
+   converted to lower case, spaces and hyphens are replaced with underscores. */

 static
 PyObject *normalizestring(const char *string)
 {
-    size_t i;
     size_t len = strlen(string);
-    char *p;
+    char *encoding;
     PyObject *v;

     if (len > PY_SSIZE_T_MAX) {
         PyErr_SetString(PyExc_OverflowError, "string is too large");
         return NULL;
     }

-    p = PyMem_Malloc(len + 1);
-    if (p == NULL)
+    encoding = PyMem_Malloc(len + 1);
+    if (encoding == NULL)
         return PyErr_NoMemory();
-    for (i = 0; i < len; i++) {
-        char ch = string[i];
-        if (ch == ' ')
-            ch = '-';
-        else
-            ch = Py_TOLOWER(Py_CHARMASK(ch));
-        p[i] = ch;
+
+    if (!_Py_normalize_encoding(string, encoding, len + 1))
+    {
+        PyErr_SetString(PyExc_RuntimeError, "_Py_normalize_encoding() failed");
+        PyMem_Free(encoding);
+        return NULL;
     }
-    p[i] = '\0';
-    v = PyUnicode_FromString(p);
-    PyMem_Free(p);
+
+    v = PyUnicode_FromString(encoding);
+    PyMem_Free(encoding);
     return v;
 }
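
The normalization rule that `normalizestring()` now delegates to can be sketched in Python for readers who want to experiment with it. This is an illustration only, not the C implementation: `normalize_sketch` is a hypothetical name, and the rule is assumed from the NEWS entry, namely the same behaviour as `encodings.normalize_encoding` followed by lower-casing.

```python
import encodings


def normalize_sketch(name: str) -> str:
    """Hypothetical stand-in for the rule described in the NEWS entry."""
    out = []
    pending_separator = False
    for ch in name:
        if (ch.isascii() and ch.isalnum()) or ch == ".":
            # Emit one underscore for any run of punctuation seen since the
            # last kept character, but never at the very start.
            if pending_separator and out:
                out.append("_")
            pending_separator = False
            out.append(ch.lower())
        else:
            # Spaces, hyphens and other punctuation only set a flag.
            pending_separator = True
    return "".join(out)


# Arbitrary sample names (illustrative only).
names = ["UTF 8", "ISO-8859-1", "latex+latin1"]
results = [normalize_sketch(n) for n in names]
print(results)
```

For ASCII names like these, the sketch should agree with `encodings.normalize_encoding(name).lower()`, which is a quick way to sanity-check it.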
