Description
We originally discovered this while using the Net::SMTP library, specifically smtp_object.open_message_stream, which provides an output IO object to write to. We were using IO.copy_stream to copy data to that stream and noticed that the data was sometimes corrupted.
Repro
Stripping down the original scenario, I arrived at the following reproduction script. It generates some data in a StringIO and then copies it to a Net::InternetMessageIO, which is the class used by Net::SMTP.
require "net/protocol"
require "socket"
require "stringio"

ADDRESS = "127.0.0.1"
PORT = 3456
METHOD = ARGV[0]

TOTAL_BYTES = 32 * 1024
LINE_LENGTH = 128
LINE_COUNT = TOTAL_BYTES / LINE_LENGTH
FILLER = "a" * (LINE_LENGTH - 8)
DATA = Range.new(1, LINE_COUNT).map { |i| sprintf("% 5d %s", i, FILLER) }.join("\r\n")

StringIO.open(DATA) do |reader|
  socket = Net::InternetMessageIO.new(TCPSocket.open(ADDRESS, PORT))
  socket.write_message_by_block do |socket_writer|
    bytes_written = 0
    if METHOD == "copy_stream"
      puts "Using IO.copy_stream"
      bytes_written += IO.copy_stream(reader, socket_writer)
    elsif METHOD == "manual"
      puts "Using manual stream copying"
      while !reader.eof?
        bytes_written += socket_writer.write(reader.read(8 * 1024))
      end
    else
      raise "Unknown method: #{METHOD}"
    end
    puts "#{bytes_written} bytes written"
  end
end
A test run, comparing JRuby against MRI:
$ rbenv local jruby-9.2.13.0
$
$ netcat -l 127.0.0.1 3456 > output &
$ ruby copy_stream.rb copy_stream
Using IO.copy_stream
32766 bytes written
$ wc -c output
40961 output
$
$ netcat -l 127.0.0.1 3456 > output &
$ ruby copy_stream.rb manual
Using manual stream copying
32640 bytes written
$ wc -c output
32771 output
$
$ rbenv local 2.5.8
$
$ netcat -l 127.0.0.1 3456 > output &
$ ruby copy_stream.rb copy_stream
Using IO.copy_stream
32640 bytes written
$ wc -c output
32771 output
$
$ netcat -l 127.0.0.1 3456 > output &
$ ruby copy_stream.rb manual
Using manual stream copying
32640 bytes written
$ wc -c output
32771 output
$
The wc -c counts show that MRI (with either method) and JRuby with manual block copying all output the same number of bytes, 32771, and the files have identical contents. JRuby with IO.copy_stream, however, writes 40961 bytes: an extra 8190 bytes (~8k) of duplicated data. The returned byte count is also inconsistent: 32766 for JRuby with IO.copy_stream versus 32640 for the others. In every run the returned count differs from what was actually written, presumably because InternetMessageIO is adding some extra characters somewhere along the way (it looks like an extra byte per line, plus an extra line consisting of ".\r\n").
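As a sanity check on these numbers, the size of the generated payload can be computed without any network I/O. This is only a sketch: it rebuilds DATA the same way the repro script does, and the framing overhead (a CRLF to terminate the last line, then the ".\r\n" end-of-message marker) is an assumption about what InternetMessageIO appends, not something verified against its source here. The names PAYLOAD_BYTES, ASSUMED_FRAMING_BYTES and EXPECTED_ON_WIRE are made up for this example.

```ruby
LINE_LENGTH = 128
LINE_COUNT  = (32 * 1024) / LINE_LENGTH   # 256 lines
FILLER      = "a" * (LINE_LENGTH - 8)     # 120 bytes of filler per line

# Same construction as the repro script: 126 bytes per line, joined by CRLF.
DATA = Range.new(1, LINE_COUNT).map { |i| sprintf("% 5d %s", i, FILLER) }.join("\r\n")

PAYLOAD_BYTES = DATA.bytesize

# Assumption: the writer terminates the final line with CRLF and then
# appends the end-of-message marker ".\r\n".
ASSUMED_FRAMING_BYTES = "\r\n".bytesize + ".\r\n".bytesize
EXPECTED_ON_WIRE = PAYLOAD_BYTES + ASSUMED_FRAMING_BYTES

puts "payload bytes:            #{PAYLOAD_BYTES}"
puts "expected bytes on wire:   #{EXPECTED_ON_WIRE}"
puts "excess in the 40961 run:  #{40961 - EXPECTED_ON_WIRE}"
```

Under that assumption the payload is 32766 bytes, which is the figure JRuby's IO.copy_stream reported, and 32771 bytes on the wire matches the three good runs. The bad run carries 8190 bytes more than that.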
Workaround
Our workaround is to avoid IO.copy_stream and copy the data manually in blocks, as in the "manual" branch of the repro script.
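For anyone who wants to drop the workaround into their own code, here is a minimal sketch of it as a standalone helper. copy_in_chunks is a hypothetical name, not part of Net::SMTP or JRuby, and the 8 KiB default chunk size is simply the value used in the repro script. It relies only on #read and #write, so it should work with any reader/writer pair that supports those.

```ruby
require "stringio"

# Hypothetical helper: copies src to dst in fixed-size chunks using only
# #read and #write, so IO.copy_stream is never involved.
def copy_in_chunks(src, dst, chunk_size = 8 * 1024)
  bytes_read = 0
  # read(length) returns nil at end of stream, which ends the loop.
  while (chunk = src.read(chunk_size))
    dst.write(chunk)
    bytes_read += chunk.bytesize
  end
  bytes_read
end

# Demonstration with in-memory streams standing in for the real ones.
source = StringIO.new("line one\r\nline two\r\n" * 1000)
sink   = StringIO.new
copied = copy_in_chunks(source, sink)
puts "#{copied} bytes copied"
```

The helper counts the bytes it read rather than summing the return values of write, since in the runs above the writer's return value did not match the size of the input.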
Affected JRuby Versions
JRuby 9.2.13.0 and 9.2.14.0 have this bug.
JRuby 9.2.9.0 through 9.2.12.0 seem to work correctly with the same script.