loadtxt() changes numbers if integers are read as strings #17277

Closed
@fratajcz

Description

When I read a file with loadtxt() and set the dtype to string, the contents are changed if the column contains only integers. I need to read IDs, which can be integers but can also contain characters, so I need to read them as strings.

Namely, the last number (in this case 100000) loses one "0" and becomes 10000. Strangely, this only happens to the last number. Even weirder, it only happens if the list ends with a number ending in a zero. It took me hours to track down this issue in my code. Do you have an idea why this happens?

Reproducing code example:

>>> import numpy as np
>>> import pandas as pd

>>> liste = list(range(1,100001))
>>> df = pd.DataFrame(liste)
>>> df
            0
0           1
1           2
2           3
3           4
4           5
...       ...
99995   99996
99996   99997
99997   99998
99998   99999
99999  100000

[100000 rows x 1 columns]
>>> df.to_csv("testfile",header=False,index=False)
>>> liste2 = np.loadtxt("testfile",dtype="str",delimiter=",",skiprows=0,usecols=0)
>>> liste2[-1]
'10000'
>>> liste2 = np.loadtxt("testfile",dtype="int",delimiter=",",skiprows=0,usecols=0)
>>> liste2[-1]
100000
>>> liste2 = np.loadtxt("testfile",dtype="str",delimiter=",",skiprows=0,usecols=0)
>>> liste1str = list(map(str,liste))
>>> liste1str == liste2
array([ True,  True,  True, ...,  True,  True, False])
>>> liste2[99]
'100'
>>> liste2[999]
'1000'
>>> liste2[9999]
'10000'
>>> liste2[99999]
'10000'
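
Possible workaround (a sketch, not a confirmed fix): the output above would be consistent with the string width being fixed at 5 characters before the only 6-character value is read, but that cause is an assumption and is not verified in this report. Under that assumption, either of the following avoids relying on an inferred width: pass an explicit fixed-width string dtype such as "U6" to loadtxt(), or read the column as plain Python strings and convert once at the end. The temporary file path and the variable names are illustrative only.

```python
import os
import tempfile

import numpy as np

# Recreate the file from the report: the IDs 1..100000, one per line.
path = os.path.join(tempfile.mkdtemp(), "testfile")
with open(path, "w") as fh:
    fh.writelines(f"{i}\n" for i in range(1, 100001))

# Option 1: give loadtxt an explicit string width (6 characters here),
# so the width is not inferred from the data.
ids_fixed = np.loadtxt(path, dtype="U6", delimiter=",", usecols=0)

# Option 2: read the first column as Python strings and build the array
# in one step; np.array then sizes the dtype from the longest string.
with open(path) as fh:
    ids_plain = np.array([line.rstrip("\n").split(",")[0] for line in fh])

print(ids_fixed[-1])
print(ids_plain[-1])
```

Option 1 requires knowing the maximum ID length in advance, and longer values would still be cut off at that width. Option 2 does not need the length up front, at the cost of holding all lines in a Python list first.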
