loadtxt() changes numbers if integers are read as strings #17277
Comments
@fratajcz, thanks for reporting this problem. This is a bug in how
Here's another example. The last value in the result of
If we skip one row, so all the data to be read is in the first block of 50001 lines, the dtype and the last value are correct:
@fratajcz (or anyone else encountering this problem): if you know in advance what the maximum length of your IDs will be, you can work around the bug by giving an explicit length with the dtype, e.g. if the maximum length of an ID is 8, you can use
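A minimal sketch of this workaround, assuming a single-column file of IDs. The helper `write_ids`, the file contents, and the width 6 are hypothetical choices for illustration and are not taken from the report. The file holds 50000 five-character IDs followed by one six-character ID, so the longer entry falls outside the first block of lines.

```python
import os
import tempfile

import numpy as np


def write_ids(path, n_short=50000, long_id="100000"):
    """Write n_short five-character IDs followed by one longer ID (made-up data)."""
    with open(path, "w") as f:
        for i in range(n_short):
            f.write(f"{10000 + i}\n")  # 10000 .. 59999, all five characters
        f.write(long_id + "\n")


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ids.txt")
    write_ids(path)

    # Unsized string dtype: on affected NumPy versions the last ID may be truncated.
    unsized = np.loadtxt(path, dtype=str)

    # Explicit width (assumed maximum ID length of 6): longer-than-first-block IDs survive.
    sized = np.loadtxt(path, dtype="U6")

print(unsized.dtype, unsized[-1])
print(sized.dtype, sized[-1])
```

With the explicit width, the last entry should come back unchanged regardless of NumPy version. The unsized result depends on whether the installed version still has the bug.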
Hi @WarrenWeckesser, thanks for the clarification, that explains a lot! Cheers!
Closes numpy#17277. If loadtxt is passed an unsized string or byte dtype, the size is set automatically from the longest entry in the first 50000 lines. If longer entries appeared later, they were silently truncated.
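The truncation described above can be illustrated with plain NumPy arrays. This is only a sketch of the mechanism, not the actual `loadtxt` source: the block contents are made up, and the "grow and assign" step is an assumption standing in for however later blocks are appended.

```python
import numpy as np

first_block = ["10000", "20000", "30000"]  # longest entry: 5 characters
later_block = ["100000"]                   # 6 characters, arrives afterwards

# An unsized string dtype is sized from the data available at that moment.
arr = np.array(first_block, dtype=str)     # fixed width of 5 characters

# Growing the array keeps that width; a longer string assigned later
# is cut to fit without any warning or error.
grown = np.empty(len(first_block) + len(later_block), dtype=arr.dtype)
grown[: len(first_block)] = arr
grown[len(first_block):] = later_block

print(grown.dtype, grown[-1])
```

Because fixed-width string arrays truncate on assignment, the width has to be chosen from all entries, or given explicitly, before any data is stored.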
This was fixed in 1.22 (it is also fixed in the C-parser, but that doesn't really matter).
Woops, no, not yet fixed probably, that would be gh-19042
Fixed by gh-20580
When I read a file with loadtxt() and the dtype is set to string, the contents are changed if the column only contains integers. I need to read IDs, which can be integers but can also contain characters, so I need to read them as strings.
Namely, the last number (in this case 100000) loses one "0" and becomes 10000. Frankly, this only happens to the last number. Even weirder, this only happens if the list ends with a number ending with a zero. It took me hours to track down this issue in my code. Do you have an idea why this happens?
Reproducing code example: