Description
Bug report
Bug description:
The implementation of _pyrepl.input.KeymapTranslator has a check for input with Unicode category "C". This code has been present since the initial commit of the new REPL in #111567. However, no such category is ever returned by unicodedata because there is no "C" entry in the list of category names¹:
>>> import sys, unicodedata
>>> any(unicodedata.category(chr(n)) == "C" for n in range(sys.maxunicode + 1)) # Python 3.12
False
I'm not familiar enough with _pyrepl to know what the implications of this always-false predicate are, but I do know that the block in question is effectively dead code because of it.
I think this is meant to be a .startswith() check for the Other category identified by UAX #44, i.e. the union of Cc | Cf | Cs | Co | Cn, in line with other usage in _pyrepl.reader. I'll open a PR for that.
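To illustrate the difference between the two predicates, here is a minimal sketch. It is not the actual _pyrepl code: the helper name is_other_category and the sample characters are my own choices for illustration, and only unicodedata.category() is a real API here.

```python
import unicodedata


def is_other_category(ch: str) -> bool:
    """Hypothetical helper: True if ch falls in any of Cc, Cf, Cs, Co, Cn."""
    return unicodedata.category(ch).startswith("C")


# One sample per "Other" subcategory, plus an ordinary letter for contrast.
samples = {
    "\x1b": "Cc",    # ESCAPE, a control character
    "\u200b": "Cf",  # ZERO WIDTH SPACE, a format character
    "\ud800": "Cs",  # lone surrogate
    "\ue000": "Co",  # private use
    "\uffff": "Cn",  # noncharacter, never assigned
    "a": "Ll",       # lowercase letter
}

for ch, expected in samples.items():
    cat = unicodedata.category(ch)
    # The equality test is False for every sample, since the API only
    # returns two-letter categories; the prefix test matches the C* ones.
    print(f"U+{ord(ch):04X} {cat} eq={cat == 'C'} prefix={is_other_category(ch)}")
```

With the equality check, none of these characters would reach the guarded block; with the prefix check, all but the letter would.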
¹ The list of category names is hardcoded in makeunicodedata.py rather than derived from the UCD, which does define C = Cc | Cf | Cs | Co | Cn in PropertyValueAliases.txt. I don't think there's any version of the unicodedata API that would support returning "C" here, though. Just being a little obsessive.
CPython versions tested on:
CPython main branch
Operating systems tested on:
Linux