gh-142037: Improve error messages for printf-style formatting#142081
gh-142037: Improve error messages for printf-style formatting#142081serhiy-storchaka merged 9 commits intopython:mainfrom
Conversation
This affects string formatting as well as bytes and bytearray formatting. * For errors in the format string, always include the position of the start of the format unit. * For errors related to the formatted arguments, always include the number or the name of the argument. * Suggest more probable causes of errors in the format string (stray %, unsupported format, unexpected character). * Provide more information when the number of arguments does not match the number of format units. * Raise more specific errors when access of arguments by name is mixed with sequential access and if * is used with a mapping. * Add tests for some uncovered cases.
Lib/test/test_format.py
Outdated
| test_exc_common("abc %Id", 1, ValueError, | ||
| "unsupported format %I at position 4") | ||
| test_exc_common("abc %'d", 1, ValueError, | ||
| "stray % at position 4 or unexpected format character ''' at position 5") |
There was a problem hiding this comment.
''' is not very readable. Would it be possible to format it as U+HHHH or 0xHH?
Lib/test/test_format.py
Outdated
| test_exc_common('%(x)r', 1, TypeError, | ||
| "format requires a mapping, not int") | ||
| test_exc_common('%*r', 3.14, TypeError, | ||
| "* requires int, not float") |
There was a problem hiding this comment.
Without context, it's uneasy to understand that the error comes from a format string. Maybe write %* instead?
| "* requires int, not float") | |
| "%* requires int, not float") |
There was a problem hiding this comment.
It is perhaps better to remove this test, because the example is incorrect in more than one way. If * is used, then we need more than one argument, and in that case we will get "format argument N" in the error message, like in the following line..
Lib/test/test_format.py
Outdated
| "format argument 1: too big for precision") | ||
| test_exc_common('%d', '1', TypeError, | ||
| "%d format: a real number is required, not str") | ||
| "a real number is required for format %d, not str") |
There was a problem hiding this comment.
is it really a real number which is expected, or an integer?
There was a problem hiding this comment.
Yes. It will be truncated to integer.
Lib/test/test_format.py
Outdated
| test_exc('%c', 2**128, OverflowError, | ||
| "%c argument not in range(0x110000)") | ||
| test_exc('%c', 3.14, TypeError, | ||
| "%c requires an integer or a unicode character, not float") |
There was a problem hiding this comment.
| "%c requires an integer or a unicode character, not float") | |
| "%c requires an integer or a Unicode character, not float") |
There was a problem hiding this comment.
This is used in many other places. We should do all such changes at once -- either capitalize "unicode", or remove it.
Objects/unicode_format.c
Outdated
| (int)arg->ch, arg->fmtstart); | ||
| } | ||
| else if (arg->ch >= 32 && arg->ch < 127) { | ||
| else if (arg->ch >= 32 && arg->ch < 127 && arg->ch != '\'') { |
There was a problem hiding this comment.
Maybe ignore also the quote character in Py_UNICODE_ISPRINTABLE() test below?
0e43000 to
c015d6a
Compare
…on into str-format-errors
serhiy-storchaka
left a comment
There was a problem hiding this comment.
I currently consider an idea of using "%x expected an integer, got float" instead of "%x requires an integer, not float". What do you think?
vstinner
left a comment
There was a problem hiding this comment.
LGTM. I left minor comments.
| "format argument: %c requires an integer or a unicode character, not float") | ||
| test_exc('%c', (3.14,), TypeError, | ||
| "format argument 1: %c requires an integer or a unicode character, not float") | ||
| test_exc('%(x)c', {'x': 3.14}, TypeError, | ||
| "format argument 'x': %c requires an integer or a unicode character, not float") | ||
| test_exc('%c', 'ab', TypeError, | ||
| "format argument: %c requires an integer or a unicode character, not a string of length 2") | ||
| test_exc('%c', ('ab',), TypeError, | ||
| "format argument 1: %c requires an integer or a unicode character, not a string of length 2") | ||
| test_exc('%(x)c', {'x': 'ab'}, TypeError, | ||
| "format argument 'x': %c requires an integer or a unicode character, not a string of length 2") | ||
| test_exc('%c', b'x', TypeError, | ||
| "format argument: %c requires an integer or a unicode character, not bytes") |
There was a problem hiding this comment.
Since we are changing error messages, I suggest to write Unicode with an uppercase U.
There was a problem hiding this comment.
This is a different issue. "unicode character" is used in other error messages which this error message was based on.
Lib/test/test_str.py
Outdated
| self.assertRaisesRegex(TypeError, '%i format: a real number is required, not complex', operator.mod, '%i', 2j) | ||
| self.assertRaisesRegex(TypeError, '%d format: a real number is required, not complex', operator.mod, '%d', 1j) | ||
| self.assertRaisesRegex(TypeError, r'%c requires an int or a unicode character, not .*\.PseudoFloat', operator.mod, '%c', pi) | ||
| self.assertRaisesRegex(TypeError, '%x requires an integer, not float', operator.mod, '%x', 3.14) |
There was a problem hiding this comment.
I would prefer to check also the format argument: prefix.
There was a problem hiding this comment.
I am not sure that it is related to the purpose of this test, but I'll do this. Although it make the lines obscenely long.
This affects string formatting as well as bytes and bytearray formatting.
%-formats #142037