@@ -19,7 +19,7 @@ other faster and simpler functions like :func:`~numpy.loadtxt` cannot.
 When giving examples, we will use the following conventions::

    >>> import numpy as np
-   >>> from io import BytesIO
+   >>> from io import StringIO



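The move from ``BytesIO`` to ``StringIO`` in this hunk follows from how Python 3 separates text from bytes. The following is a minimal sketch using only the standard library; the variable names are illustrative and not part of the patch:

```python
from io import BytesIO, StringIO

text = "1, 2, 3\n4, 5, 6"

# StringIO wraps text (str) and behaves like a file opened in text mode.
text_buffer = StringIO(text)
first_line = text_buffer.readline()

# BytesIO only accepts bytes-like input; on Python 3 a str raises TypeError.
try:
    BytesIO(text)
    bytes_accepts_str = True
except TypeError:
    bytes_accepts_str = False

# To keep a bytes buffer, the text has to be encoded explicitly.
byte_buffer = BytesIO(text.encode("utf-8"))
first_bytes_line = byte_buffer.readline()
```

This is why the examples that passed a plain string literal to ``BytesIO`` need either an encoded literal or, as done here, a text buffer.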
@@ -30,7 +30,7 @@ The only mandatory argument of :func:`~numpy.genfromtxt` is the source of
 the data. It can be a string, a list of strings, or a generator. If a
 single string is provided, it is assumed to be the name of a local or
 remote file, or an open file-like object with a :meth:`read` method, for
-example, a file or :class:`StringIO.StringIO` object. If a list of strings
+example, a file or :class:`io.StringIO` object. If a list of strings
 or a generator returning strings is provided, each string is treated as one
 line in a file. When the URL of a remote file is passed, the file is
 automatically downloaded to the current directory and opened.
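As a rough illustration of the input types described in this paragraph, the sketch below passes a list of strings and then a generator to ``np.genfromtxt``. It assumes NumPy is installed; the sample data is made up for the example:

```python
import numpy as np

lines = ["1 2 3", "4 5 6"]

# A list of strings: each element is treated as one line of a file.
from_list = np.genfromtxt(lines)

# A generator yielding strings is consumed in the same way.
from_generator = np.genfromtxt(line for line in lines)
```

Both calls should produce the same 2x3 float array, since whitespace is the default delimiter.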
@@ -58,8 +58,8 @@ Quite often, a single character marks the separation between columns. For
 example, comma-separated files (CSV) use a comma (``,``) or a semicolon
 (``;``) as delimiter::

-   >>> data = "1, 2, 3\n4, 5, 6"
-   >>> np.genfromtxt(BytesIO(data), delimiter=",")
+   >>> data = u"1, 2, 3\n4, 5, 6"
+   >>> np.genfromtxt(StringIO(data), delimiter=",")
    array([[ 1.,  2.,  3.],
           [ 4.,  5.,  6.]])

@@ -74,13 +74,13 @@ defined as a given number of characters. In that case, we need to set
 ``delimiter`` to a single integer (if all the columns have the same
 size) or to a sequence of integers (if columns can have different sizes)::

-   >>> data = "  1  2  3\n  4  5 67\n890123  4"
-   >>> np.genfromtxt(BytesIO(data), delimiter=3)
+   >>> data = u"  1  2  3\n  4  5 67\n890123  4"
+   >>> np.genfromtxt(StringIO(data), delimiter=3)
    array([[   1.,    2.,    3.],
           [   4.,    5.,   67.],
           [ 890.,  123.,    4.]])
-   >>> data = "123456789\n   4  7 9\n   4567 9"
-   >>> np.genfromtxt(BytesIO(data), delimiter=(4, 3, 2))
+   >>> data = u"123456789\n   4  7 9\n   4567 9"
+   >>> np.genfromtxt(StringIO(data), delimiter=(4, 3, 2))
    array([[ 1234.,   567.,    89.],
           [    4.,     7.,     9.],
           [    4.,   567.,     9.]])
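Fixed-width parsing depends on the exact number of characters per field, so the padding inside the string matters. The sketch below restates the two cases above as a standalone script, assuming NumPy is installed:

```python
from io import StringIO

import numpy as np

# Three fields per line, each exactly three characters wide.
data = "  1  2  3\n  4  5 67\n890123  4"
same_width = np.genfromtxt(StringIO(data), delimiter=3)

# Fields that are 4, 3 and 2 characters wide, in that order.
data = "123456789\n   4  7 9\n   4567 9"
mixed_width = np.genfromtxt(StringIO(data), delimiter=(4, 3, 2))
```

If the padding is lost (for instance by collapsing runs of spaces), the fields no longer line up with the declared widths and the parsed values change.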
@@ -94,14 +94,14 @@ individual entries are not stripped of leading nor trailing white spaces.
 This behavior can be overwritten by setting the optional argument
 ``autostrip`` to a value of ``True``::

-   >>> data = "1, abc , 2\n 3, xxx, 4"
+   >>> data = u"1, abc , 2\n 3, xxx, 4"
    >>> # Without autostrip
-   >>> np.genfromtxt(BytesIO(data), delimiter=",", dtype="|U5")
+   >>> np.genfromtxt(StringIO(data), delimiter=",", dtype="|U5")
    array([['1', ' abc ', ' 2'],
           ['3', ' xxx', ' 4']],
          dtype='|U5')
    >>> # With autostrip
-   >>> np.genfromtxt(BytesIO(data), delimiter=",", dtype="|U5", autostrip=True)
+   >>> np.genfromtxt(StringIO(data), delimiter=",", dtype="|U5", autostrip=True)
    array([['1', 'abc', '2'],
           ['3', 'xxx', '4']],
          dtype='|U5')
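The effect of ``autostrip`` is easiest to see by comparing one field from each result. This is a small sketch assuming NumPy is installed; ``raw`` and ``stripped`` are illustrative names:

```python
from io import StringIO

import numpy as np

data = "1, abc , 2\n 3, xxx, 4"

# Default: the whitespace around each field is kept.
raw = np.genfromtxt(StringIO(data), delimiter=",", dtype="U5")

# autostrip=True removes leading and trailing whitespace from every field.
stripped = np.genfromtxt(StringIO(data), delimiter=",", dtype="U5",
                         autostrip=True)

field_without_autostrip = str(raw[0, 1])
field_with_autostrip = str(stripped[0, 1])
```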
@@ -116,7 +116,7 @@ string that marks the beginning of a comment. By default,
 occur anywhere on the line. Any character present after the comment
 marker(s) is simply ignored::

-   >>> data = """#
+   >>> data = u"""#
    ... # Skip me !
    ... # Skip me too !
    ... 1, 2
@@ -126,7 +126,7 @@ marker(s) is simply ignored::
    ... # And here comes the last line
    ... 9, 0
    ... """
-   >>> np.genfromtxt(BytesIO(data), comments="#", delimiter=",")
+   >>> np.genfromtxt(StringIO(data), comments="#", delimiter=",")
    [[ 1.  2.]
     [ 3.  4.]
     [ 5.  6.]
@@ -156,10 +156,10 @@ of lines to skip at the beginning of the file, before any other action is
 performed. Similarly, we can skip the last ``n`` lines of the file by
 using the ``skip_footer`` attribute and giving it a value of ``n``::

-   >>> data = "\n".join(str(i) for i in range(10))
-   >>> np.genfromtxt(BytesIO(data),)
+   >>> data = u"\n".join(str(i) for i in range(10))
+   >>> np.genfromtxt(StringIO(data),)
    array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.])
-   >>> np.genfromtxt(BytesIO(data),
+   >>> np.genfromtxt(StringIO(data),
    ...               skip_header=3, skip_footer=5)
    array([ 3.,  4.])

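The arithmetic behind ``skip_header`` and ``skip_footer`` can be checked directly: with ten lines, dropping the first three and the last five should leave two. A minimal sketch, assuming NumPy is installed:

```python
from io import StringIO

import numpy as np

# Ten lines holding the integers 0 through 9, one per line.
data = "\n".join(str(i) for i in range(10))

everything = np.genfromtxt(StringIO(data))

# Skip lines 0-2 at the top and lines 5-9 at the bottom.
middle = np.genfromtxt(StringIO(data), skip_header=3, skip_footer=5)
```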
@@ -180,21 +180,21 @@ integers behave the same as regular Python negative indexes.
 For example, if we want to import only the first and the last columns, we
 can use ``usecols=(0, -1)``::

-   >>> data = "1 2 3\n4 5 6"
-   >>> np.genfromtxt(BytesIO(data), usecols=(0, -1))
+   >>> data = u"1 2 3\n4 5 6"
+   >>> np.genfromtxt(StringIO(data), usecols=(0, -1))
    array([[ 1.,  3.],
           [ 4.,  6.]])

 If the columns have names, we can also select which columns to import by
 giving their name to the ``usecols`` argument, either as a sequence
 of strings or a comma-separated string::

-   >>> data = "1 2 3\n4 5 6"
-   >>> np.genfromtxt(BytesIO(data),
+   >>> data = u"1 2 3\n4 5 6"
+   >>> np.genfromtxt(StringIO(data),
    ...               names="a, b, c", usecols=("a", "c"))
    array([(1.0, 3.0), (4.0, 6.0)],
          dtype=[('a', '<f8'), ('c', '<f8')])
-   >>> np.genfromtxt(BytesIO(data),
+   >>> np.genfromtxt(StringIO(data),
    ...               names="a, b, c", usecols=("a, c"))
    array([(1.0, 3.0), (4.0, 6.0)],
          dtype=[('a', '<f8'), ('c', '<f8')])
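The two selection styles return different kinds of arrays, which the following sketch makes explicit. It assumes NumPy is installed; ``by_index`` and ``by_name`` are illustrative names:

```python
from io import StringIO

import numpy as np

data = "1 2 3\n4 5 6"

# Select by position: first and last columns, giving a plain 2-D array.
by_index = np.genfromtxt(StringIO(data), usecols=(0, -1))

# Select by name: the result is a structured array with one field per column.
by_name = np.genfromtxt(StringIO(data), names="a, b, c", usecols=("a", "c"))

selected_fields = by_name.dtype.names
```

Selecting by name requires ``names`` to be set, so the result is always a structured array in that case.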
@@ -252,15 +252,15 @@ A natural approach when dealing with tabular data is to allocate a name to
 each column. A first possibility is to use an explicit structured dtype,
 as mentioned previously::

-   >>> data = BytesIO("1 2 3\n 4 5 6")
+   >>> data = StringIO("1 2 3\n 4 5 6")
    >>> np.genfromtxt(data, dtype=[(_, int) for _ in "abc"])
    array([(1, 2, 3), (4, 5, 6)],
          dtype=[('a', '<i8'), ('b', '<i8'), ('c', '<i8')])

 Another simpler possibility is to use the ``names`` keyword with a
 sequence of strings or a comma-separated string::

-   >>> data = BytesIO("1 2 3\n 4 5 6")
+   >>> data = StringIO("1 2 3\n 4 5 6")
    >>> np.genfromtxt(data, names="A, B, C")
    array([(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)],
          dtype=[('A', '<f8'), ('B', '<f8'), ('C', '<f8')])
@@ -274,7 +274,7 @@ that case, we must use the ``names`` keyword with a value of
 ``True``. The names will then be read from the first line (after the
 ``skip_header`` ones), even if the line is commented out::

-   >>> data = BytesIO("So it goes\n#a b c\n1 2 3\n 4 5 6")
+   >>> data = StringIO("So it goes\n#a b c\n1 2 3\n 4 5 6")
    >>> np.genfromtxt(data, skip_header=1, names=True)
    array([(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)],
          dtype=[('a', '<f8'), ('b', '<f8'), ('c', '<f8')])
@@ -283,7 +283,7 @@ The default value of ``names`` is ``None``. If we give any other
 value to the keyword, the new names will overwrite the field names we may
 have defined with the dtype::

-   >>> data = BytesIO("1 2 3\n 4 5 6")
+   >>> data = StringIO("1 2 3\n 4 5 6")
    >>> ndtype=[('a',int), ('b', float), ('c', int)]
    >>> names = ["A", "B", "C"]
    >>> np.genfromtxt(data, names=names, dtype=ndtype)
@@ -298,23 +298,23 @@ If ``names=None`` but a structured dtype is expected, names are defined
 with the standard NumPy default of ``"f%i"``, yielding names like ``f0``,
 ``f1`` and so forth::

-   >>> data = BytesIO("1 2 3\n 4 5 6")
+   >>> data = StringIO("1 2 3\n 4 5 6")
    >>> np.genfromtxt(data, dtype=(int, float, int))
    array([(1, 2.0, 3), (4, 5.0, 6)],
          dtype=[('f0', '<i8'), ('f1', '<f8'), ('f2', '<i8')])

 In the same way, if we don't give enough names to match the length of the
 dtype, the missing names will be defined with this default template::

-   >>> data = BytesIO("1 2 3\n 4 5 6")
+   >>> data = StringIO("1 2 3\n 4 5 6")
    >>> np.genfromtxt(data, dtype=(int, float, int), names="a")
    array([(1, 2.0, 3), (4, 5.0, 6)],
          dtype=[('a', '<i8'), ('f0', '<f8'), ('f1', '<i8')])

 We can overwrite this default with the ``defaultfmt`` argument, that
 takes any format string::

-   >>> data = BytesIO("1 2 3\n 4 5 6")
+   >>> data = StringIO("1 2 3\n 4 5 6")
    >>> np.genfromtxt(data, dtype=(int, float, int), defaultfmt="var_%02i")
    array([(1, 2.0, 3), (4, 5.0, 6)],
          dtype=[('var_00', '<i8'), ('var_01', '<f8'), ('var_02', '<i8')])
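The three naming rules in this hunk can be compared side by side by looking only at the resulting field names. The sketch below assumes NumPy is installed; the helper ``field_names`` is hypothetical and exists only to avoid repeating the call:

```python
from io import StringIO

import numpy as np

TEXT = "1 2 3\n 4 5 6"


def field_names(**kwargs):
    """Parse TEXT with a mixed dtype and return the resulting field names."""
    # A fresh StringIO is needed each time because reading exhausts the buffer.
    result = np.genfromtxt(StringIO(TEXT), dtype=(int, float, int), **kwargs)
    return result.dtype.names


default_names = field_names()                      # no names given
partial_names = field_names(names="a")             # only one name given
custom_names = field_names(defaultfmt="var_%02i")  # custom template
```

Note that a partially named dtype restarts the default counter at 0 for the unnamed fields, which is why the second case yields ``f0`` and ``f1`` after ``a``.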
@@ -377,10 +377,10 @@ In the following example, the second column is converted from as string
 representing a percentage to a float between 0 and 1::

    >>> convertfunc = lambda x: float(x.strip("%"))/100.
-   >>> data = "1, 2.3%, 45.\n6, 78.9%, 0"
+   >>> data = u"1, 2.3%, 45.\n6, 78.9%, 0"
    >>> names = ("i", "p", "n")
    >>> # General case .....
-   >>> np.genfromtxt(BytesIO(data), delimiter=",", names=names)
+   >>> np.genfromtxt(StringIO(data), delimiter=",", names=names)
    array([(1.0, nan, 45.0), (6.0, nan, 0.0)],
          dtype=[('i', '<f8'), ('p', '<f8'), ('n', '<f8')])

@@ -390,7 +390,7 @@ and ``' 78.9%'`` cannot be converted to float and we end up having
 ``np.nan`` instead. Let's now use a converter::

    >>> # Converted case ...
-   >>> np.genfromtxt(BytesIO(data), delimiter=",", names=names,
+   >>> np.genfromtxt(StringIO(data), delimiter=",", names=names,
    ...               converters={1: convertfunc})
    array([(1.0, 0.023, 45.0), (6.0, 0.78900000000000003, 0.0)],
          dtype=[('i', '<f8'), ('p', '<f8'), ('n', '<f8')])
@@ -399,7 +399,7 @@ The same results can be obtained by using the name of the second column
 (``"p"``) as key instead of its index (1)::

    >>> # Using a name for the converter ...
-   >>> np.genfromtxt(BytesIO(data), delimiter=",", names=names,
+   >>> np.genfromtxt(StringIO(data), delimiter=",", names=names,
    ...               converters={"p": convertfunc})
    array([(1.0, 0.023, 45.0), (6.0, 0.78900000000000003, 0.0)],
          dtype=[('i', '<f8'), ('p', '<f8'), ('n', '<f8')])
@@ -411,9 +411,9 @@ string into the corresponding float or into -999 if the string is empty.
 We need to explicitly strip the string from white spaces as it is not done
 by default::

-   >>> data = "1, , 3\n 4, 5, 6"
+   >>> data = u"1, , 3\n 4, 5, 6"
    >>> convert = lambda x: float(x.strip() or -999)
-   >>> np.genfromtxt(BytesIO(data), delimiter=",",
+   >>> np.genfromtxt(StringIO(data), delimiter=",",
    ...               converters={1: convert})
    array([[   1., -999.,    3.],
           [   4.,    5.,    6.]])
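Whether a converter receives ``str`` or ``bytes`` has varied between NumPy releases, so a converter that calls ``x.strip("%")`` may fail if it is handed bytes. The sketch below sidesteps this by passing ``encoding="utf-8"`` explicitly; that keyword is an addition for this example (assuming NumPy 1.14 or later) and is not part of the patch:

```python
from io import StringIO

import numpy as np


def percent_to_fraction(text):
    """Turn a string such as ' 2.3%' into the float 0.023."""
    return float(text.strip().strip("%")) / 100.0


def float_or_sentinel(text):
    """Turn an empty or blank field into -999, otherwise parse it as float."""
    return float(text.strip() or -999)


# Converter keyed by column name; encoding is set so the converter gets str.
percent_data = "1, 2.3%, 45.\n6, 78.9%, 0"
percentages = np.genfromtxt(StringIO(percent_data), delimiter=",",
                            names=("i", "p", "n"),
                            converters={"p": percent_to_fraction},
                            encoding="utf-8")

# Converter keyed by column index, filling the blank cell in the first row.
sparse_data = "1, , 3\n 4, 5, 6"
filled = np.genfromtxt(StringIO(sparse_data), delimiter=",",
                       converters={1: float_or_sentinel},
                       encoding="utf-8")
```

Named functions are used instead of lambdas only to make room for docstrings; the behaviour is intended to match the converters shown in the hunks above.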
@@ -489,13 +489,13 @@ with ``"N/A"`` in the first column and by ``"???"`` in the third column.
 We wish to transform these missing values to 0 if they occur in the first
 and second column, and to -999 if they occur in the last column::

-   >>> data = "N/A, 2, 3\n4, ,???"
+   >>> data = u"N/A, 2, 3\n4, ,???"
    >>> kwargs = dict(delimiter=",",
    ...               dtype=int,
    ...               names="a,b,c",
    ...               missing_values={0:"N/A", 'b':" ", 2:"???"},
    ...               filling_values={0:0, 'b':0, 2:-999})
-   >>> np.genfromtxt(BytesIO(data), **kwargs)
+   >>> np.genfromtxt(StringIO(data), **kwargs)
    array([(0, 2, 3), (4, 0, -999)],
          dtype=[('a', '<i8'), ('b', '<i8'), ('c', '<i8')])

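The per-column mapping of ``missing_values`` to ``filling_values`` can be verified field by field. This is a sketch assuming NumPy is installed; the keys mix column indices and a column name, as in the hunk above:

```python
from io import StringIO

import numpy as np

data = "N/A, 2, 3\n4, ,???"

result = np.genfromtxt(
    StringIO(data),
    delimiter=",",
    dtype=int,
    names="a,b,c",
    # Marker strings that count as missing, per column.
    missing_values={0: "N/A", "b": " ", 2: "???"},
    # Replacement value for each column when its entry is missing.
    filling_values={0: 0, "b": 0, 2: -999},
)

column_a = result["a"].tolist()
column_b = result["b"].tolist()
column_c = result["c"].tolist()
```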