
Commit bee48b0

RDoc Recipes for write converters and RFC 4180 compliance (#185)
1 parent f0bab6a commit bee48b0

2 files changed (+209 -17 lines)

doc/csv/recipes/generating.rdoc

+36
@@ -17,6 +17,9 @@ All code snippets on this page assume that the following has been executed:
  - {Generating to an IO Stream}[#label-Generating+to+an+IO+Stream]
    - {Recipe: Generate to IO Stream with Headers}[#label-Recipe-3A+Generate+to+IO+Stream+with+Headers]
    - {Recipe: Generate to IO Stream Without Headers}[#label-Recipe-3A+Generate+to+IO+Stream+Without+Headers]
+- {Converting Fields}[#label-Converting+Fields]
+  - {Recipe: Filter Generated Field Strings}[#label-Recipe-3A+Filter+Generated+Field+Strings]
+  - {Recipe: Specify Multiple Write Converters}[#label-Recipe-3A+Specify+Multiple+Write+Converters]

=== Output Formats

@@ -111,3 +114,36 @@ Use class method CSV.new without option +headers+ to generate \CSV data to an \I
    csv << ['Baz', 2]
  end
  p File.read(path) # => "Foo,0\nBar,1\nBaz,2\n"
+
+=== Converting Fields
+
+You can use _write_ _converters_ to convert fields when generating \CSV.
+
+==== Recipe: Filter Generated Field Strings
+
+Use option <tt>:write_converters</tt> and a custom converter to convert field values when generating \CSV.
+
+This example defines and uses a custom write converter to strip whitespace from generated fields:
+  strip_converter = proc {|field| field.respond_to?(:strip) ? field.strip : field }
+  output_string = CSV.generate(write_converters: strip_converter) do |csv|
+    csv << [' foo ', 0]
+    csv << [' bar ', 1]
+    csv << [' baz ', 2]
+  end
+  output_string # => "foo,0\nbar,1\nbaz,2\n"
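
A write converter receives every field, including non-String values such as the Integers above, which is why the converter guards with <tt>respond_to?</tt>. A rough sketch of that point (the +number_formatter+ converter below is illustrative, not part of this commit):
  number_formatter = proc {|field| field.is_a?(Float) ? field.round(2) : field }
  CSV.generate(write_converters: number_formatter) do |csv|
    csv << ['pi', 3.14159] # the Float is rounded; the String passes through unchanged
  end # => "pi,3.14\n"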
+
+==== Recipe: Specify Multiple Write Converters
+
+Use option <tt>:write_converters</tt> and multiple custom converters
+to convert field values when generating \CSV.
+
+This example defines and uses two custom write converters to strip and upcase generated fields:
+  strip_converter = proc {|field| field.respond_to?(:strip) ? field.strip : field }
+  upcase_converter = proc {|field| field.respond_to?(:upcase) ? field.upcase : field }
+  converters = [strip_converter, upcase_converter]
+  output_string = CSV.generate(write_converters: converters) do |csv|
+    csv << [' foo ', 0]
+    csv << [' bar ', 1]
+    csv << [' baz ', 2]
+  end
+  output_string # => "FOO,0\nBAR,1\nBAZ,2\n"
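
The converters in the array are applied in order: each field is stripped first, then upcased. The same option should work anywhere \CSV accepts instance options; a minimal sketch with CSV.open (the file name here is only for illustration):
  CSV.open('example.csv', 'w', write_converters: converters) do |csv|
    csv << [' foo ', 0]
  end
  File.read('example.csv') # => "FOO,0\n"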

doc/csv/recipes/parsing.rdoc

+173 -17
@@ -17,6 +17,25 @@ All code snippets on this page assume that the following has been executed:
  - {Parsing from an IO Stream}[#label-Parsing+from+an+IO+Stream]
    - {Recipe: Parse from IO Stream with Headers}[#label-Recipe-3A+Parse+from+IO+Stream+with+Headers]
    - {Recipe: Parse from IO Stream Without Headers}[#label-Recipe-3A+Parse+from+IO+Stream+Without+Headers]
+- {RFC 4180 Compliance}[#label-RFC+4180+Compliance]
+  - {Row Separator}[#label-Row+Separator]
+    - {Recipe: Handle Compliant Row Separator}[#label-Recipe-3A+Handle+Compliant+Row+Separator]
+    - {Recipe: Handle Non-Compliant Row Separator}[#label-Recipe-3A+Handle+Non-Compliant+Row+Separator]
+  - {Column Separator}[#label-Column+Separator]
+    - {Recipe: Handle Compliant Column Separator}[#label-Recipe-3A+Handle+Compliant+Column+Separator]
+    - {Recipe: Handle Non-Compliant Column Separator}[#label-Recipe-3A+Handle+Non-Compliant+Column+Separator]
+  - {Quote Character}[#label-Quote+Character]
+    - {Recipe: Handle Compliant Quote Character}[#label-Recipe-3A+Handle+Compliant+Quote+Character]
+    - {Recipe: Handle Non-Compliant Quote Character}[#label-Recipe-3A+Handle+Non-Compliant+Quote+Character]
+  - {Recipe: Allow Liberal Parsing}[#label-Recipe-3A+Allow+Liberal+Parsing]
+- {Special Handling}[#label-Special+Handling]
+  - {Special Line Handling}[#label-Special+Line+Handling]
+    - {Recipe: Ignore Blank Lines}[#label-Recipe-3A+Ignore+Blank+Lines]
+    - {Recipe: Ignore Selected Lines}[#label-Recipe-3A+Ignore+Selected+Lines]
+  - {Special Field Handling}[#label-Special+Field+Handling]
+    - {Recipe: Strip Fields}[#label-Recipe-3A+Strip+Fields]
+    - {Recipe: Handle Null Fields}[#label-Recipe-3A+Handle+Null+Fields]
+    - {Recipe: Handle Empty Fields}[#label-Recipe-3A+Handle+Empty+Fields]
- {Converting Fields}[#label-Converting+Fields]
  - {Converting Fields to Objects}[#label-Converting+Fields+to+Objects]
    - {Recipe: Convert Fields to Integers}[#label-Recipe-3A+Convert+Fields+to+Integers]
@@ -164,6 +183,143 @@ Output:
  ["bar", "1"]
  ["baz", "2"]

+=== RFC 4180 Compliance
+
+By default, \CSV parses data that is compliant with
+{RFC 4180}[https://tools.ietf.org/html/rfc4180]
+with respect to:
+- Row separator.
+- Column separator.
+- Quote character.
+
+==== Row Separator
+
+RFC 4180 specifies the row separator CRLF (Ruby "\r\n").
+
+Although the \CSV default row separator is "\n",
+the parser by default also handles row separator "\r" and the RFC-compliant "\r\n".
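
For illustration, with no options given, all three of these separators parse to the same rows:
  CSV.parse("foo,0\nbar,1\n")     # => [["foo", "0"], ["bar", "1"]]
  CSV.parse("foo,0\r\nbar,1\r\n") # => [["foo", "0"], ["bar", "1"]]
  CSV.parse("foo,0\rbar,1\r")     # => [["foo", "0"], ["bar", "1"]]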
+
+===== Recipe: Handle Compliant Row Separator
+
+For strict compliance, use option +:row_sep+ to specify row separator "\r\n",
+which allows the compliant row separator:
+  source = "foo,1\r\nbar,1\r\nbaz,2\r\n"
+  CSV.parse(source, row_sep: "\r\n") # => [["foo", "1"], ["bar", "1"], ["baz", "2"]]
+But it rejects other row separators:
+  source = "foo,1\nbar,1\nbaz,2\n"
+  CSV.parse(source, row_sep: "\r\n") # Raises MalformedCSVError
+  source = "foo,1\rbar,1\rbaz,2\r"
+  CSV.parse(source, row_sep: "\r\n") # Raises MalformedCSVError
+  source = "foo,1\n\rbar,1\n\rbaz,2\n\r"
+  CSV.parse(source, row_sep: "\r\n") # Raises MalformedCSVError
+
+===== Recipe: Handle Non-Compliant Row Separator
+
+For data with non-compliant row separators, use option +:row_sep+.
+This example source uses semicolon (';') as its row separator:
+  source = "foo,1;bar,1;baz,2;"
+  CSV.parse(source, row_sep: ';') # => [["foo", "1"], ["bar", "1"], ["baz", "2"]]
+
+==== Column Separator
+
+RFC 4180 specifies column separator COMMA (Ruby ',').
+
+===== Recipe: Handle Compliant Column Separator
+
+Because the \CSV default column separator is ',',
+you need not specify option +:col_sep+ for compliant data:
+  source = "foo,1\nbar,1\nbaz,2\n"
+  CSV.parse(source) # => [["foo", "1"], ["bar", "1"], ["baz", "2"]]
+
+===== Recipe: Handle Non-Compliant Column Separator
+
+For data with non-compliant column separators, use option +:col_sep+.
+This example source uses TAB ("\t") as its column separator:
+  source = "foo\t1\nbar\t1\nbaz\t2\n"
+  CSV.parse(source, col_sep: "\t") # => [["foo", "1"], ["bar", "1"], ["baz", "2"]]
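
Since +:col_sep+ accepts an arbitrary String, a multi-character separator can be handled the same way; a small sketch (the '::' separator is just an example):
  CSV.parse("foo::1\nbar::2\n", col_sep: '::') # => [["foo", "1"], ["bar", "2"]]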
+
+==== Quote Character
+
+RFC 4180 specifies quote character DQUOTE (Ruby '"').
+
+===== Recipe: Handle Compliant Quote Character
+
+Because the \CSV default quote character is '"',
+you need not specify option +:quote_char+ for compliant data:
+  source = "\"foo\",\"1\"\n\"bar\",\"1\"\n\"baz\",\"2\"\n"
+  CSV.parse(source) # => [["foo", "1"], ["bar", "1"], ["baz", "2"]]
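
Quoting matters most when a field itself contains the column or row separator; for instance:
  CSV.parse_line("\"foo,bar\",1") # => ["foo,bar", "1"]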
+
+===== Recipe: Handle Non-Compliant Quote Character
+
+For data with non-compliant quote characters, use option +:quote_char+.
+This example source uses SQUOTE ("'") as its quote character:
+  source = "'foo','1'\n'bar','1'\n'baz','2'\n"
+  CSV.parse(source, quote_char: "'") # => [["foo", "1"], ["bar", "1"], ["baz", "2"]]
+
+==== Recipe: Allow Liberal Parsing
+
+Use option +:liberal_parsing+ to specify that \CSV should
+attempt to parse input not conformant with RFC 4180, such as double quotes in unquoted fields:
+  source = 'is,this "three, or four",fields'
+  CSV.parse(source) # Raises MalformedCSVError
+  CSV.parse(source, liberal_parsing: true) # => [["is", "this \"three", " or four\"", "fields"]]
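
When input of unknown quality must be parsed, the error can also be rescued; a minimal sketch:
  begin
    CSV.parse('is,this "three, or four",fields')
  rescue CSV::MalformedCSVError => error
    error.class # => CSV::MalformedCSVError
  end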
+
+=== Special Handling
+
+You can use parsing options to specify special handling for certain lines and fields.
+
+==== Special Line Handling
+
+Use parsing options to specify special handling for blank lines, or for other selected lines.
+
+===== Recipe: Ignore Blank Lines
+
+Use option +:skip_blanks+ to ignore blank lines:
+  source = <<-EOT
+foo,0
+
+bar,1
+baz,2
+
+,
+EOT
+  parsed = CSV.parse(source, skip_blanks: true)
+  parsed # => [["foo", "0"], ["bar", "1"], ["baz", "2"], [nil, nil]]
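
Note that the final line in the source, ",", is not blank: it contains a column separator, so it parses as a row of two null fields, [nil, nil]. To drop such rows as well, one could filter the result, for example:
  parsed.reject {|row| row.all?(&:nil?) } # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]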
+
+===== Recipe: Ignore Selected Lines
+
+Use option +:skip_lines+ to ignore selected lines:
+  source = <<-EOT
+# Comment
+foo,0
+bar,1
+baz,2
+# Another comment
+EOT
+  parsed = CSV.parse(source, skip_lines: /^#/)
+  parsed # => [["foo", "0"], ["bar", "1"], ["baz", "2"]]
+
+==== Special Field Handling
+
+Use parsing options to specify special handling for certain field values.
+
+===== Recipe: Strip Fields
+
+Use option +:strip+ to strip parsed field values:
+  CSV.parse_line(' a , b ', strip: true) # => ["a", "b"]
+
+===== Recipe: Handle Null Fields
+
+Use option +:nil_value+ to specify a value that will replace each field
+that is null (no text):
+  CSV.parse_line('a,,b,,c', nil_value: 0) # => ["a", 0, "b", 0, "c"]
+
+===== Recipe: Handle Empty Fields
+
+Use option +:empty_value+ to specify a value that will replace each field
+that is empty (a \String of length 0):
+  CSV.parse_line('a,"",b,"",c', empty_value: 'x') # => ["a", "x", "b", "x", "c"]
+
=== Converting Fields

You can use field converters to change parsed \String fields into other objects,
@@ -180,49 +336,49 @@ There are built-in field converters for converting to objects of certain classes
- \DateTime

Other built-in field converters include:
-- <tt>:numeric</tt>: converts to \Integer and \Float.
-- <tt>:all</tt>: converts to \DateTime, \Integer, \Float.
+- +:numeric+: converts to \Integer and \Float.
+- +:all+: converts to \DateTime, \Integer, \Float.

You can also define field converters to convert to objects of other classes.

===== Recipe: Convert Fields to Integers

-Convert fields to \Integer objects using built-in converter <tt>:integer</tt>:
+Convert fields to \Integer objects using built-in converter +:integer+:
  source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
  parsed = CSV.parse(source, headers: true, converters: :integer)
  parsed.map {|row| row['Value'].class} # => [Integer, Integer, Integer]

===== Recipe: Convert Fields to Floats

-Convert fields to \Float objects using built-in converter <tt>:float</tt>:
+Convert fields to \Float objects using built-in converter +:float+:
  source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
  parsed = CSV.parse(source, headers: true, converters: :float)
  parsed.map {|row| row['Value'].class} # => [Float, Float, Float]

===== Recipe: Convert Fields to Numerics

-Convert fields to \Integer and \Float objects using built-in converter <tt>:numeric</tt>:
+Convert fields to \Integer and \Float objects using built-in converter +:numeric+:
  source = "Name,Value\nfoo,0\nbar,1.1\nbaz,2.2\n"
  parsed = CSV.parse(source, headers: true, converters: :numeric)
  parsed.map {|row| row['Value'].class} # => [Integer, Float, Float]

===== Recipe: Convert Fields to Dates

-Convert fields to \Date objects using built-in converter <tt>:date</tt>:
+Convert fields to \Date objects using built-in converter +:date+:
  source = "Name,Date\nfoo,2001-02-03\nbar,2001-02-04\nbaz,2001-02-03\n"
  parsed = CSV.parse(source, headers: true, converters: :date)
  parsed.map {|row| row['Date'].class} # => [Date, Date, Date]

===== Recipe: Convert Fields to DateTimes

-Convert fields to \DateTime objects using built-in converter <tt>:date_time</tt>:
+Convert fields to \DateTime objects using built-in converter +:date_time+:
  source = "Name,DateTime\nfoo,2001-02-03\nbar,2001-02-04\nbaz,2020-05-07T14:59:00-05:00\n"
  parsed = CSV.parse(source, headers: true, converters: :date_time)
  parsed.map {|row| row['DateTime'].class} # => [DateTime, DateTime, DateTime]

===== Recipe: Convert Assorted Fields to Objects

-Convert assorted fields to objects using built-in converter <tt>:all</tt>:
+Convert assorted fields to objects using built-in converter +:all+:
  source = "Type,Value\nInteger,0\nFloat,1.0\nDateTime,2001-02-04\n"
  parsed = CSV.parse(source, headers: true, converters: :all)
  parsed.map {|row| row['Value'].class} # => [Integer, Float, DateTime]
@@ -265,12 +421,12 @@ then refer to the converter by its name:
==== Using Multiple Field Converters

You can use multiple field converters in either of these ways:
-- Specify converters in option <tt>:converters</tt>.
+- Specify converters in option +:converters+.
- Specify converters in a custom converter list.

-===== Recipe: Specify Multiple Field Converters in Option <tt>:converters</tt>
+===== Recipe: Specify Multiple Field Converters in Option +:converters+

-Apply multiple field converters by specifying them in option <tt>:converters</tt>:
+Apply multiple field converters by specifying them in option +:converters+:
  source = "Name,Value\nfoo,0\nbar,1.0\nbaz,2.0\n"
  parsed = CSV.parse(source, headers: true, converters: [:integer, :float])
  parsed['Value'] # => [0, 1.0, 2.0]
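
For this particular source, the single built-in converter +:numeric+ (which combines +:integer+ and +:float+) produces the same result:
  CSV.parse(source, headers: true, converters: :numeric)['Value'] # => [0, 1.0, 2.0]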
@@ -291,21 +447,21 @@ Apply multiple field converters by defining and registering a custom converter l
You can use header converters to modify parsed \String headers.

Built-in header converters include:
-- <tt>:symbol</tt>: converts \String header to \Symbol.
-- <tt>:downcase</tt>: converts \String header to lowercase.
+- +:symbol+: converts \String header to \Symbol.
+- +:downcase+: converts \String header to lowercase.

You can also define header converters to otherwise modify header \Strings.

==== Recipe: Convert Headers to Lowercase

-Convert headers to lowercase using built-in converter <tt>:downcase</tt>:
+Convert headers to lowercase using built-in converter +:downcase+:
  source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
  parsed = CSV.parse(source, headers: true, header_converters: :downcase)
  parsed.headers # => ["name", "value"]

==== Recipe: Convert Headers to Symbols

-Convert headers to downcased Symbols using built-in converter <tt>:symbol</tt>:
+Convert headers to downcased Symbols using built-in converter +:symbol+:
  source = "Name,Value\nfoo,0\nbar,1\nbaz,2\n"
  parsed = CSV.parse(source, headers: true, header_converters: :symbol)
  parsed.headers # => [:name, :value]
@@ -334,12 +490,12 @@ then refer to the converter by its name:
==== Using Multiple Header Converters

You can use multiple header converters in either of these ways:
-- Specify header converters in option <tt>:header_converters</tt>.
+- Specify header converters in option +:header_converters+.
- Specify header converters in a custom header converter list.

===== Recipe: Specify Multiple Header Converters in Option :header_converters

-Apply multiple header converters by specifying them in option <tt>:header_converters</tt>:
+Apply multiple header converters by specifying them in option +:header_converters+:
  source = "Name,Value\nfoo,0\nbar,1.0\nbaz,2.0\n"
  parsed = CSV.parse(source, headers: true, header_converters: [:downcase, :symbol])
  parsed.headers # => [:name, :value]
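
Since +:symbol+ already downcases headers (as noted above), the single converter gives the same headers for this source:
  CSV.parse(source, headers: true, header_converters: :symbol).headers # => [:name, :value]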
