Replace PyUnicode_CompareWithASCIIString with _PyUnicode_EqualToASCIIString #72887
The proposed patch replaces calls to the public function PyUnicode_CompareWithASCIIString() with calls to a new private function, _PyUnicode_EqualToASCIIString(). The problem with PyUnicode_CompareWithASCIIString() is that it returns -1 both for the result "less than" and for an error, but the error case is never checked. The patch serves the following purposes:
_PyUnicode_EqualToASCIIString() returns a true value (1) if the strings are equal and a false value (0) if they differ, and it doesn't raise exceptions. Unlike PyUnicode_CompareWithASCIIString(), it works only with ASCII characters and returns false if either string contains non-ASCII characters. The patch also documents the return value of PyUnicode_CompareWithASCIIString() in case of error. See bpo-21449 for a similar issue with _PyUnicode_CompareWithId(). |
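The ambiguity described above can be sketched in standalone C. This is only an illustration under stated assumptions: it does not use the CPython API, a Unicode string is modelled as an array of code points, and `compare_with_ascii` and `equal_to_ascii` are hypothetical stand-ins for the two functions discussed, with a NULL argument standing in for "an error occurred".

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical three-way comparison (NOT the CPython function).
   Returns -1, 0 or 1, and also returns -1 on error (NULL argument),
   so the caller cannot tell "less than" from "error". */
int
compare_with_ascii(const uint32_t *left, size_t left_len, const char *right)
{
    if (left == NULL || right == NULL) {
        return -1;  /* error, indistinguishable from "less than" */
    }
    size_t i = 0;
    for (; i < left_len && right[i] != '\0'; i++) {
        uint32_t r = (unsigned char)right[i];
        if (left[i] != r) {
            return left[i] < r ? -1 : 1;
        }
    }
    if (i < left_len) {
        return 1;   /* left is longer */
    }
    if (right[i] != '\0') {
        return -1;  /* right is longer */
    }
    return 0;
}

/* Hypothetical boolean equality test (NOT the CPython function).
   Returns 1 if equal, 0 otherwise; any non-ASCII character on either
   side, or a NULL argument, simply yields 0. */
int
equal_to_ascii(const uint32_t *left, size_t left_len, const char *right)
{
    if (left == NULL || right == NULL) {
        return 0;
    }
    if (strlen(right) != left_len) {
        return 0;
    }
    for (size_t i = 0; i < left_len; i++) {
        unsigned char r = (unsigned char)right[i];
        if (r >= 128 || left[i] >= 128 || left[i] != r) {
            return 0;
        }
    }
    return 1;
}
```

With the three-way function, a caller writing `if (compare_with_ascii(s, n, "abc") == 0)` happens to be safe, but any caller that inspects the sign treats an error as "less than". The boolean variant has only two outcomes, so there is nothing left to confuse.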
LGTM. |
Patch LGTM. |
New changeset 386c682dcd75 by Serhiy Storchaka in branch '3.5': New changeset 72d07d13869a by Serhiy Storchaka in branch '3.6': New changeset 6f0f77333da5 by Serhiy Storchaka in branch 'default': |
Thanks Xiang and Inada for your reviews. The patch fixes a bug: errors were not checked after calling PyUnicode_CompareWithASCIIString(). The patch is not applicable to 2.7 since PyUnicode_CompareWithASCIIString() is Python 3 only. |
(I reopen the issue to ask my question :-))
+/* Test whether a unicode is equal to ASCII string. Return 1 if true,
Can you please also document the behaviour if you pass two non-ASCII strings which are equal? I understand that it also returns 0, right? Maybe the API should be stricter and require right to be ASCII: "right string must be encoded to ASCII". I would expect an assertion error or a fatal error if right is non-ASCII when Python is compiled in debug mode. |
What does "equal" mean? The left argument is a Unicode string, but the right argument is a byte string. To compare them we have to either decode the right argument or encode the left argument. The result depends on the encoding used. _PyUnicode_EqualToASCIIString() uses ASCII (as its name indicates). Non-ASCII strings can't be equal. This is documented. If the documentation is not clear, could you suggest better wording?
I hesitated to add an assertion error or a fatal error in a bug fix. But this can be added in the development version. I don't know which is better: returning 0 in all builds, or returning 0 in release builds and crashing in debug builds? |
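The remark that the result depends on the encoding can be made concrete with a small standalone C sketch. The helpers `encode_utf8_small` and `encode_latin1` are hypothetical and written only for this illustration; they are not part of CPython, and the UTF-8 helper deliberately handles only code points below U+0800.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one code point below U+0800 as UTF-8.
   Returns the number of bytes written (1 or 2), or 0 if the code
   point is outside the range this sketch handles. */
size_t
encode_utf8_small(uint32_t cp, unsigned char out[2])
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    return 0;
}

/* Hypothetical helper: encode one code point as Latin-1.
   Returns 1 on success, 0 if the code point is not representable. */
size_t
encode_latin1(uint32_t cp, unsigned char out[1])
{
    if (cp < 0x100) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    return 0;
}
```

For an ASCII code point such as U+0061 both helpers produce the single byte 0x61, so the choice of encoding does not matter. For U+00E9 the Latin-1 form is the single byte 0xE9 while the UTF-8 form is the two bytes 0xC3 0xA9, so whether a given byte string "equals" the Unicode string depends on which encoding is assumed. Restricting the comparison to ASCII avoids that question.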
I suggest "return 0 in release build and crash in debug build". |
New changeset faf04a995031 by Serhiy Storchaka in branch '3.5': New changeset ff3dacc98b3a by Serhiy Storchaka in branch '3.6': New changeset 765013f71bc4 by Serhiy Storchaka in branch 'default': |
The correct issue for the above commits is bpo-21449. |
New changeset b607f835f170 by Serhiy Storchaka in branch '3.5': New changeset 1369e51182b7 by Serhiy Storchaka in branch '3.6': New changeset ba14f8b61bd8 by Serhiy Storchaka in branch 'default': |
The following patch adds checks in debug mode that the right argument of _PyUnicode_EqualToASCIIString and _PyUnicode_EqualToASCIIId is an ASCII-only string. |
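The "return 0 in release build and crash in debug build" behaviour can be sketched roughly as follows. This is not the actual patch: the standard `assert` macro with `NDEBUG` stands in for CPython's debug-build checks, the Unicode string is modelled as an array of code points, and `is_ascii_only` and `equal_to_ascii_checked` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return 1 if every byte of s is in the ASCII range, 0 otherwise. */
int
is_ascii_only(const char *s)
{
    for (; *s != '\0'; s++) {
        if ((unsigned char)*s >= 128) {
            return 0;
        }
    }
    return 1;
}

/* Hypothetical equality test with a debug-only precondition check
   (NOT the CPython function). */
int
equal_to_ascii_checked(const uint32_t *left, size_t left_len,
                       const char *right)
{
    /* Debug builds abort here if the caller passes a non-ASCII right
       argument; the check is compiled out when NDEBUG is defined. */
    assert(is_ascii_only(right));

    if (strlen(right) != left_len) {
        return 0;
    }
    for (size_t i = 0; i < left_len; i++) {
        unsigned char r = (unsigned char)right[i];
        /* Release builds still reject non-ASCII bytes by returning 0. */
        if (r >= 128 || left[i] != r) {
            return 0;
        }
    }
    return 1;
}
```

The assertion turns a misuse by the caller into an immediate, visible failure during development, while release builds keep the documented "returns 0" behaviour.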
_PyUnicode_EqualToASCII-runtime-check.diff LGTM. |
New changeset 6dd22ed7140e by Serhiy Storchaka in branch '3.6': New changeset 44874b20e612 by Serhiy Storchaka in branch 'default': |
For the record: This is all that happened in decimal if a) you
>>> from decimal import *
>>> import _testcapi
>>> context = Context()
>>> traps = _testcapi.unicode_legacy_string('traps')
>>> getattr(context, traps)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError
Both a) and b) are not trivial to accomplish at all and the result |
New changeset 35334a4d41aa by Stefan Krah in branch '3.5': |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.