IO.copy_stream double-writes to InternetMessageIO #6555

@kalenp

Description

We originally discovered this while using the Net::SMTP library, specifically smtp_object.open_message_stream, which provides an output IO object to write to. We were using IO.copy_stream to copy data to that stream and noticed that the data would sometimes be corrupted.

Repro

Stripping down the original scenario, I arrived at the following reproduction script. It generates some data in a StringIO and then copies it to a Net::InternetMessageIO, which is the class used by Net::SMTP.

require "net/protocol"
require "socket"
require "stringio"

ADDRESS = "127.0.0.1"
PORT = 3456

METHOD = ARGV[0]

TOTAL_BYTES = 32 * 1024
LINE_LENGTH = 128
LINE_COUNT = TOTAL_BYTES / LINE_LENGTH
FILLER = "a" * (LINE_LENGTH - 8)
DATA = Range.new(1, LINE_COUNT).map { |i| sprintf("% 5d %s", i, FILLER) }.join("\r\n")

StringIO.open(DATA) do |reader|
  socket = Net::InternetMessageIO.new(TCPSocket.open(ADDRESS, PORT))
  socket.write_message_by_block do |socket_writer|
    bytes_written = 0

    if METHOD == "copy_stream"
      puts "Using IO.copy_stream"
      bytes_written += IO.copy_stream(reader, socket_writer)
    elsif METHOD == "manual"
      puts "Using manual stream copying"
      while !reader.eof?
        bytes_written += socket_writer.write(reader.read(8 * 1024))
      end
    else
      raise "Unknown method: #{METHOD}"
    end

    puts "#{bytes_written} bytes written"
  end
end

Test runs on JRuby 9.2.13.0, with MRI 2.5.8 for comparison:

$ rbenv local jruby-9.2.13.0
$ 
$ netcat -l 127.0.0.1 3456 > output &
$ ruby copy_stream.rb copy_stream
Using IO.copy_stream
32766 bytes written
$ wc -c output
40961 output
$ 
$ netcat -l 127.0.0.1 3456 > output &
$ ruby copy_stream.rb manual
Using manual stream copying
32640 bytes written
$ wc -c output
32771 output
$
$ rbenv local 2.5.8
$
$ netcat -l 127.0.0.1 3456 > output &
$ ruby copy_stream.rb copy_stream
Using IO.copy_stream
32640 bytes written
$ wc -c output
32771 output
$
$ netcat -l 127.0.0.1 3456 > output &
$ ruby copy_stream.rb manual
Using manual stream copying
32640 bytes written
$ wc -c output
32771 output
$

The wc -c counts show that MRI (with either method) and JRuby with manual block copying all output the same number of bytes, and the resulting files have identical contents. JRuby with IO.copy_stream, however, produces roughly 8 KB of additional, duplicated data. The byte count returned under JRuby with IO.copy_stream (32766) also differs from the count returned in the other three runs (32640). In every run, the returned count differs from the number of bytes actually written, presumably because InternetMessageIO adds some extra characters along the way (it looks like an extra byte per line, plus an extra line consisting of ".\r\n").
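To put the numbers above side by side, here is a small sketch that rebuilds the same payload as the repro script (using lowercase local variables in place of the script's constants) and compares its size with the wc -c results quoted in the transcript. The variable names and the interpretation in the comments are my own assumptions, not something confirmed in this report.

```ruby
# Rebuild the payload exactly as the repro script does.
total_bytes = 32 * 1024
line_length = 128
line_count  = total_bytes / line_length
filler      = "a" * (line_length - 8)
data = Range.new(1, line_count).map { |i| sprintf("% 5d %s", i, filler) }.join("\r\n")

payload_size   = data.bytesize
good_received  = 32_771 # wc -c for MRI and for JRuby + manual copying (from the transcript)
jruby_received = 40_961 # wc -c for JRuby + IO.copy_stream (from the transcript)

puts "payload size:            #{payload_size}"
# Bytes on the wire beyond the payload in the correct runs. Assumption: this
# could be a closing "\r\n" plus the ".\r\n" terminator.
puts "overhead in good runs:   #{good_received - payload_size}"
# Extra bytes in the bad run, close to one 8 KB block.
puts "extra bytes under JRuby: #{jruby_received - good_received}"
```

If this arithmetic is right, the payload itself is 32766 bytes, which matches the value IO.copy_stream returned under JRuby, and the duplicated portion is 8190 bytes.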

Workaround

As a workaround, we copy the data manually in blocks instead of calling IO.copy_stream.
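A minimal sketch of that block-by-block copy, pulled out into a reusable helper. The helper name copy_in_chunks and the 8 KB default are hypothetical choices for illustration (the repro script inlines the same loop); it relies only on #read and #write, so it should work with any reader and writer that respond to those methods, including the writer yielded by write_message_by_block.

```ruby
require "stringio"

# Hypothetical helper, not part of Net::Protocol or the original report:
# copies src to dst in fixed-size chunks without calling IO.copy_stream.
def copy_in_chunks(src, dst, chunk_size = 8 * 1024)
  total = 0
  # read(length) returns nil once the source is exhausted, ending the loop.
  while (chunk = src.read(chunk_size))
    # Accumulate whatever the destination reports as written.
    total += dst.write(chunk)
  end
  total
end

# Usage with in-memory streams standing in for the real reader and socket.
src = StringIO.new("x" * 20_000)
dst = StringIO.new
copied = copy_in_chunks(src, dst)
puts "#{copied} bytes copied"
```

Note that the returned total is whatever the destination's #write reports, so with InternetMessageIO it may not equal the number of bytes that end up on the wire.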

Affected JRuby Versions

JRuby 9.2.13.0 and 9.2.14.0 have this bug.
JRuby 9.2.9.0 through 9.2.12.0 seem to work correctly.
