inflate and gunzip transducers #3825
Would be great to have property-based tests:
There is only decompression implemented in this PR, so I think I could add
Amazing work @LGLO. I left some minor comments and agree on the necessity of the property check. Big big thanks for working on this!
}
.map {
  case (buffer, inflater) => {
    case None => ZIO.succeed(Chunk.empty)
When a transducer receives a None, it should treat it as an end-of-stream marker, so at this point, it should flush everything it has (so it should do inflater.inflate) and prepare for a new set of input chunks (inflater.reset).
pullOutput, which I'll rename to pullAllOutput, already does this when the inflater is .finished() after reading some part of the data. What's more, it works if there are multiple deflated parts inside one stream.
However, it wouldn't work if a deflated part was only partially transduced (the Inflater wouldn't be .finished()). Thanks for catching this! I'll add a test for this case and fix it!
@iravid actually, I don't know if I can test it at the ZStream level. Should I write the test at the level of calling push?
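To make the end-of-stream behaviour discussed above concrete, here is a minimal sketch that uses only `java.util.zip` and leaves ZIO out entirely. `makePush`, `deflate`, and this `pullAllOutput` are hypothetical names for illustration, not the PR's actual code; the sketch assumes a single zlib-wrapped deflated part and never calls `end()` on the inflater.

```scala
import java.io.ByteArrayOutputStream
import java.util.zip.{Deflater, Inflater}

object InflatePushSketch {
  // Hypothetical helper: drains everything the inflater can produce right now.
  // Inflater.inflate returns 0 when it needs more input or has finished.
  def pullAllOutput(inflater: Inflater, buffer: Array[Byte]): Array[Byte] = {
    val out = new ByteArrayOutputStream()
    var n   = inflater.inflate(buffer)
    while (n > 0) {
      out.write(buffer, 0, n)
      n = inflater.inflate(buffer)
    }
    out.toByteArray
  }

  // Hypothetical push function: Some(bytes) is a chunk, None marks end of stream.
  def makePush(bufferSize: Int): Option[Array[Byte]] => Array[Byte] = {
    val inflater = new Inflater()
    val buffer   = new Array[Byte](bufferSize)
    val push: Option[Array[Byte]] => Array[Byte] = {
      case Some(bytes) =>
        inflater.setInput(bytes)
        pullAllOutput(inflater, buffer)
      case None =>
        // Flush whatever is left, then get ready for a new set of input chunks.
        val rest = pullAllOutput(inflater, buffer)
        inflater.reset()
        rest
    }
    push
  }

  // Test helper (an assumption for this sketch): zlib-compress a byte array.
  def deflate(data: Array[Byte]): Array[Byte] = {
    val deflater = new Deflater()
    deflater.setInput(data)
    deflater.finish()
    val out = new ByteArrayOutputStream()
    val buf = new Array[Byte](64)
    while (!deflater.finished()) {
      val n = deflater.deflate(buf)
      out.write(buf, 0, n)
    }
    deflater.end()
    out.toByteArray
  }
}
```

Driving `push` by hand, with two chunks followed by `None`, is roughly what a test at the level of calling push would do.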
import zio._

private[compression] class Gunzipper private (var state: Gunzipper.State) {
I suggest making the functionality in this file mutable and ZIO-less. It'd be much faster this way and you're only doing synchronous effects and error handling, so you can just replace that with exceptions.
private val fixedHeaderLength = 10

sealed trait State {
I love how you organized the steps here. I'm not knowledgeable about the GZIP encoding, so I won't review everything here, but it looks great from the maintenance angle.
Hopefully ZIO Codec will make such parsers easier to write 💪🏻
Yeah that’s fine. There’s an abstraction in the transducer tests that lets you manually drive the transducer.
…On 21 Jun 2020, 10:56 +0300, Lech Głowiak ***@***.***>, wrote:
@LGLO commented on this pull request.
In streams/jvm/src/main/scala/zio/stream/compression/package.scala:
> + * HTTP 'deflate' content-encoding should use nowrap = false. See .
+ * */
+ def inflate(
+ bufferSize: Int = 64 * 1024,
+ noWrap: Boolean = false
+ ): ZTransducer[Any, CompressionException, Byte, Byte] = {
+ def makeInflater(
+ bufferSize: Int
+ ): ZManaged[Any, Nothing, Option[zio.Chunk[Byte]] => ZIO[Any, CompressionException, Chunk[Byte]]] =
+ ZManaged
+ .make(ZIO.effectTotal((new Array[Byte](bufferSize), new Inflater(noWrap)))) {
+ case (_, inflater) => ZIO.effectTotal(inflater.end())
+ }
+ .map {
+ case (buffer, inflater) => {
+ case None => ZIO.succeed(Chunk.empty)
Force-pushed: b733b8b to 6bfc164
@LGLO Looks excellent. This is good to merge by me if you don't have any planned follow-ups. I left some minor comments but they're not critical.
def close(): Unit = state.close()

def onChunk(c: Chunk[Byte]): ZIO[Any, CompressionException, Chunk[Byte]] =
In terms of code organization, I'd be fine with this just being : Chunk[Byte] and wrapping the exceptions into the CompressionException in compression.gunzip.
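One way to read this suggestion, as a rough sketch: the core stays mutable and throws, and a single boundary function turns exceptions into typed errors. `Either` stands in for the ZIO error channel here, and `RawInflater`, `safely`, and this `CompressionException` definition are hypothetical stand-ins, not the PR's types.

```scala
import java.util.zip.{DataFormatException, Inflater}

object ExceptionWrappingSketch {
  // Hypothetical stand-in for the PR's CompressionException.
  final case class CompressionException(cause: Throwable) extends Exception(cause)

  // Plain, ZIO-less core: mutable state, errors signalled by throwing.
  final class RawInflater(bufferSize: Int) {
    private val inflater = new Inflater()
    private val buffer   = new Array[Byte](bufferSize)

    def onChunk(chunk: Array[Byte]): Array[Byte] = {
      inflater.setInput(chunk)
      val out = Array.newBuilder[Byte]
      var n   = inflater.inflate(buffer) // may throw DataFormatException
      while (n > 0) {
        out ++= buffer.take(n)
        n = inflater.inflate(buffer)
      }
      out.result()
    }

    def close(): Unit = inflater.end()
  }

  // Boundary: the only place where exceptions become typed errors.
  def safely[A](body: => A): Either[CompressionException, A] =
    try Right(body)
    catch { case e: DataFormatException => Left(CompressionException(e)) }
}
```

With this split, `onChunk` can return plain bytes, and only the outer layer (here `safely`, in the PR presumably `compression.gunzip`) needs to know about the error type.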
.map {
  case (buffer, inflater) => {
    case None =>
      ZIO.succeed {
Technically this should be ZIO.effectTotal (ZIO.succeed is lazy and just forwards to effectTotal but it's nicer for maintainability to use effectTotal).
private class ParseHeaderStep(acc: Array[Byte], crc32: CRC32) extends State {

//TODO: If the whole input is shorter than the fixed header, no output is produced and no error is signaled. Is it ok?
Sounds like we should throw an exception if the input does not contain a valid header. Is that possible to do?
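As a sketch of what such a check could look like: the gzip format (RFC 1952) starts with a 10-byte fixed header whose first two bytes are `0x1f 0x8b` and whose third byte is the compression method, `8` for deflate. `checkFixedHeader` is a hypothetical helper, and it returns `Either` rather than throwing only to keep the example easy to test.

```scala
object GzipHeaderSketch {
  // Length of the fixed part of a gzip header (RFC 1952).
  val FixedHeaderLength = 10

  // Hypothetical check covering only the magic bytes and the compression method;
  // flags and optional header fields are out of scope for this sketch.
  def checkFixedHeader(bytes: Array[Byte]): Either[String, Unit] =
    if (bytes.length < FixedHeaderLength)
      Left("input is shorter than the fixed gzip header")
    else if ((bytes(0) & 0xff) != 0x1f || (bytes(1) & 0xff) != 0x8b)
      Left("invalid gzip magic bytes")
    else if (bytes(2) != 8)
      Left("unsupported compression method")
    else
      Right(())
}
```

In a streaming setting the "too short" case can only be decided once end-of-stream arrives, which is why the TODO above is tied to the `None` branch.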
Should I change:
```
case None =>
  ZIO.succeed {
    gunzipper.reset()
    Chunk.empty
  }
```
to
```
case None =>
  ZIO.succeed {
    if (gunzipper.isFinished()) {
      gunzipper.reset()
      Chunk.empty
    } else {
      <fail because the stream wasn't completely gzipped>
    }
  }
```
?
I would prefer the error and not the empty chunk :) Is there an "auto decompress" path that decompresses if the header exists, and leaves the bytes intact if not?
If we wanted such a path, then an invalid header would have to fall back to a clear-text stream, because there is no criterion to distinguish between an invalid header and no header.
@LGLO To your question: Yes, you should add that else branch, but you should use ZIO.effect instead.
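A minimal sketch of that else branch, with assumptions stated up front: `scala.util.Try` stands in for `ZIO.effect` (both capture a thrown exception as a failure), a bare `java.util.zip.Inflater` and its `finished()` stand in for the gunzipper and `isFinished()`, and `onEnd` is a hypothetical name.

```scala
import java.util.zip.Inflater
import scala.util.Try

object EndOfStreamSketch {
  // Called when the end-of-stream marker (None) arrives.
  // Fails if the compressed data was cut off before the inflater finished.
  def onEnd(inflater: Inflater): Try[Array[Byte]] =
    Try {
      if (inflater.finished()) {
        inflater.reset() // ready for a new set of input chunks
        Array.emptyByteArray
      } else
        throw new IllegalStateException("stream ended before the compressed data was complete")
    }
}
```

The point of the effect-capturing wrapper is that the throw inside the block surfaces as a failure value instead of escaping, which a total wrapper such as `ZIO.succeed` would not do.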
Thanks!
I haven't abandoned this work. Hopefully I can spend some time on it this evening and next week.
Thank you @LGLO!
Oh, one last request: can we move the |
I'll move this to a trait and mix it in to platform 👍
… bug in pulling all from inflater
Docs are added. Tests against more involved headers are added (and some bugs fixed as well). Removed WIP.
@LGLO Fantastic work and coverage. Thank you!!
Resolves #3879
Ready for review
deflate and gzip are out of scope of this PR, decompression only