ZTransducer.splitLines fails with OutOfMemoryError #3659

@TobiasPfeifer

I get an OutOfMemoryError when using ZTransducer.splitLines. However, I can read the full file content into a String using Apache Commons FileUtils.readFileToString.

I would expect a streaming library, when processing a file line by line, to use no more heap space than the size of the file.
The file is ~107 MB (110,499,590 bytes) and has 1568 lines. The longest line is a String of ~224,000 characters.
The JVM's InitialHeapSize is set to 512 MB and MaxHeapSize to 4 GB (which shouldn't matter, since FileUtils can read the whole file into a String).

for {
  file <- args.headOption
    .map(filePath => IO(Paths.get(filePath)))
    .getOrElse(IO.fail("No file specified"))

  inputStream <- IO(Files.newInputStream(file))
  // Reading the whole file into memory with FileUtils works (kept for comparison):
  //_ <- console.putStrLn("reading file to memory using FileUtils.readFileToString...")
  //s = FileUtils.readFileToString(file.toFile, Charset.forName("UTF-8"))
  //_ <- console.putStrLn(s"length of file in memory: ${s.length}")
  stream = ZStream.fromInputStream(inputStream)
    .transduce(ZTransducer.utf8Decode)
    .transduce[ZEnv, IOException, String](ZTransducer.splitLines)
  _ <- stream.runDrain
} yield ()

java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3332)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
    at java.lang.StringBuilder.append(StringBuilder.java:136)
    at zio.stream.ZTransducer$.$anonfun$splitLines$6(ZTransducer.scala:635)
    at zio.stream.ZTransducer$.$anonfun$splitLines$6$adapted(ZTransducer.scala:634)
    at zio.stream.ZTransducer$$$Lambda$205/1628294067.apply(Unknown Source)
    at zio.Chunk$Singleton.foreach(Chunk.scala:1146)
    at zio.Chunk$Concat.foreach(Chunk.scala:1124)
    at zio.stream.ZTransducer$.$anonfun$splitLines$4(ZTransducer.scala:634)

I'm willing to fix this bug myself, with some help.
Looking at the implementation, I'd suggest also replacing the String concatenation in ZTransducer.splitLines with a StringBuilder held as local mutable state. However, I'm not sure whether there is a restriction on using local state in ZTransducers.
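
To make the suggestion concrete, here is a minimal sketch of a chunk-wise line splitter that keeps only the unfinished tail of the current line in a StringBuilder. It is plain Scala and does not use or reproduce the ZIO implementation; LineSplitter, push and finish are hypothetical names chosen for illustration. Under these assumptions the buffer should never grow beyond the longest line (~224,000 characters for the file above).

```scala
// Hypothetical helper, NOT part of ZIO: splits a sequence of text chunks into
// lines, carrying only the incomplete tail of the current line between chunks.
final class LineSplitter {
  // Partial line left over from earlier chunks (local mutable state).
  private val carry = new java.lang.StringBuilder
  // True if the previous chunk ended with '\r' (a following '\n' must be skipped).
  private var pendingCR = false

  /** Feeds one chunk and returns the lines completed by it. */
  def push(chunk: String): Vector[String] = {
    val out = Vector.newBuilder[String]
    var i = 0
    if (pendingCR && chunk.nonEmpty) {
      // A "\r\n" pair was split across two chunks; the '\r' already ended the line.
      if (chunk.charAt(0) == '\n') i = 1
      pendingCR = false
    }
    var start = i
    while (i < chunk.length) {
      val c = chunk.charAt(i)
      if (c == '\n' || c == '\r') {
        carry.append(chunk, start, i) // copy only the rest of this line
        out += carry.toString
        carry.setLength(0)            // reuse the buffer for the next line
        if (c == '\r') {
          if (i + 1 < chunk.length) {
            if (chunk.charAt(i + 1) == '\n') i += 1 // consume "\r\n" as one break
          } else pendingCR = true
        }
        start = i + 1
      }
      i += 1
    }
    carry.append(chunk, start, chunk.length) // keep the unfinished tail
    out.result()
  }

  /** Returns the last line if the input did not end with a line break. */
  def finish(): Option[String] =
    if (carry.length() == 0) None
    else {
      val last = carry.toString
      carry.setLength(0)
      Some(last)
    }
}

object LineSplitterDemo {
  def main(args: Array[String]): Unit = {
    val splitter = new LineSplitter
    val chunks = List("ab\ncd", "ef\r", "\ngh")
    val lines = chunks.flatMap(splitter.push) ++ splitter.finish()
    println(lines) // List(ab, cdef, gh)
  }
}
```

Whether this kind of local mutable state is acceptable inside a ZTransducer is exactly the open question above; the sketch only shows that the buffering itself can be bounded by the longest line.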
