Bug report
Bug description:
The `INT` opcode in pickle is the `I` character followed by an ASCII number and a newline. There are multiple comments asking if the base should be explicitly set to 10, or kept as 0. However, a discrepancy exists between pickle implementations:
- `_pickle.c` uses `strtol(s, &endptr, 0);` with a base of 0, meaning `0xf` would succeed
- `pickle.py` uses `int(data, 0)` with a base of 0, meaning `0xf` would succeed
- `pickletools.py` uses `read_decimalnl_short()`, which calls `int(s)`, meaning any non-decimal base would fail
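The difference between the two Python-level calls can be checked with `int` alone, independent of pickle. This is a minimal sketch; the helper name `try_parse` is made up for illustration and is not part of any pickle module.

```python
def try_parse(text, *args):
    """Return int(text, *args), or the ValueError message if parsing fails."""
    try:
        return int(text, *args)
    except ValueError as exc:
        return f"ValueError: {exc}"


# Base 0 infers the base from the prefix, like int(data, 0) in pickle.py.
print(try_parse("0xf", 0))   # 15

# No base argument means base 10, like int(s) in pickletools.py.
print(try_parse("0xf"))      # ValueError: invalid literal for int() with base 10: '0xf'
```

Plain decimal input such as `"15"` parses identically in both calls, which is why the discrepancy only shows up for prefixed or zero-padded arguments.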
This same inconsistency exists with the `LONG` opcode.
This means an attempt to disassemble such a pickle bytestream using `pickletools` would fail, while the actual unpickling process would proceed without error.
Personally, I don't really care whether all implementations are changed to base 10 or base 0 (`save_long()` only writes the number in decimal form), but I think it should be consistent across all implementations. I'd submit a pull request for one way or the other, but I'm not sure which way you'd prefer it.
Also, as a note, the pickle bytestream `b'I0001\n.'` (`INT` with the argument `0001`) fails in `pickle.py` because leading zeros in a number parsed with base 0 cause an error. No error is raised in `_pickle.c`, because it uses `strtol`, or in `pickletools.py`, because it doesn't specify base 0. If we keep base 0, that discrepancy between `pickle.py` and the other pickle implementations would stay, whereas if we change it to base 10 (i.e. remove base 0), that inconsistency would also go away. For `LONG`, both `pickle.py` and `_pickle.c` fail with `b'L0001L\n.'`, but `pickletools.py` has no problem displaying that number (since it has no base specified).
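To compare the three code paths on a given interpreter, something like the following probe could be used. It is only a sketch: `probe` is a made-up helper, and `pickle._loads` is the private pure-Python loader in `pickle.py`, so relying on it is an assumption about CPython internals. No expected output is shown because the results depend on the CPython version being run.

```python
import pickle
import pickletools


def probe(data):
    """Run one bytestream through the three code paths and record each outcome."""

    def outcome(func):
        try:
            return repr(func())
        except Exception as exc:  # report any failure instead of stopping
            return f"{type(exc).__name__}: {exc}"

    return {
        # pickle.loads is normally backed by the C accelerator (_pickle.c).
        "pickle.loads": outcome(lambda: pickle.loads(data)),
        # pickle._loads is the private pure-Python loader from pickle.py.
        "pickle._loads": outcome(lambda: pickle._loads(data)),
        # genops yields (opcode, arg, pos); take the argument of the first opcode.
        "pickletools.genops": outcome(lambda: next(pickletools.genops(data))[1]),
    }


for stream in (b"I0xf\n.", b"I0001\n.", b"L0001L\n."):
    print(stream)
    for name, result in probe(stream).items():
        print(f"  {name}: {result}")
```

Each bytestream is one of the cases discussed above, so any row where the three results differ points at an inconsistency between implementations.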
CPython versions tested on:
3.11
Operating systems tested on:
Linux