Add Zlib-Intel Optimizations to Linux System.IO.Compression.Native.so #6876
It's great that there are perf improvements from this, but at least for right now I would prefer we not do this. We should use whatever zlib the user has on their system, either having built/installed explicitly or gotten from their package manager. cc'ing a few folks for their opinions in case they differ from mine: |
Thanks for the feedback, @stephentoub. Could you elaborate a bit more on why you prefer using the system-provided library? I would like to see the architectural features used by the Windows build at parity with the Linux build if possible. It sounds like you may be open to this PR in the future? |
In general we want to use the libraries that are standard to that platform. We only ship zlib as part of System.IO.Compression on Windows because there isn't one built into / exposed from Windows and there is no similar default package manager for the OS, and so it's a necessity... that's not the case on Unix systems. If something else in the app is using zlib, we shouldn't be using a different one. If there's a vulnerability in zlib and the admin of the system updates the zlib on the machine, we don't want them to also need to patch another unrelated binary. If the dev or admin has chosen a particular build of zlib for their own reasons, maybe even made their own improvements, we'd like to use whichever they've deemed most applicable. Etc. It's great that Intel's made improvements, and I would much rather see Intel push its improvements upstream into the main zlib, such that all platforms get the benefits, rather than us getting those benefits by effectively shipping our own. |
Ah, that makes sense. Thanks. |
I agree with @stephentoub. If |
Thanks for the feedback. Closing. |
Thanks, @bjjones. |
Just stumbled upon this while trying to figure out why Windows CoreFX generates different output from every other platform. IMO consistent cross-platform results should be expected with .NET Core, and this PR (assuming it generates identical output to the Windows implementation) should be implemented for that reason. |
@jherby2k by output, you mean it's a perfectly valid output but just not byte for byte identical? Why is that important to your scenario? |
In my case I was trying to troubleshoot why ImageSharp was generating different PNG images on Windows and Linux, causing unit tests to fail. After confirming there is no problem with either file, I had to add diverging test paths based on the OS. Just wasted cycles, that's all. |
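One way to avoid OS-specific test paths in a situation like this is to compare the decoded data rather than the compressed bytes. The following is a minimal sketch in Python using the standard `zlib` module, not the ImageSharp or CoreFX test code; the helper name `same_payload` is hypothetical, and the two compression levels merely stand in for two different zlib builds.

```python
import zlib


def same_payload(a: bytes, b: bytes) -> bool:
    """Return True when two zlib streams decode to identical bytes."""
    return zlib.decompress(a) == zlib.decompress(b)


# Sample input; any byte string works here.
data = b"The quick brown fox jumps over the lazy dog. " * 200

# Two valid encodings of the same input, produced with different settings.
# This is only a stand-in for "two different zlib implementations".
fast = zlib.compress(data, 1)
best = zlib.compress(data, 9)

# The encoded bytes may differ, but both streams decode to the same data,
# so a test built on same_payload() does not depend on the encoder used.
print("encoded bytes identical:", fast == best)
print("decoded payload identical:", same_payload(fast, best))
```

A test written this way asserts on what the format guarantees (the decoded content) and leaves the exact encoder output unconstrained.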
DeflateStream is used in PNG encoding/decoding, and the differences in compression output lead to different file sizes. In this case, however, the Intel implementation leads to larger output, so the Adler implementation would be preferred, which means the rejection of this PR gets my support. A PNG is encoded once but can be downloaded thousands of times; bytes count. I see that the compression changes were noted in #5674 (comment) but dismissed as minor. In my opinion that was an oversight. |
I mean, if we were starting over I’d say reject the initial patch for the Windows version, but since it’s a part of the codebase now I imagine reverting is out of the question. Makes more sense to at least standardize, IMO, and the Windows version is probably always going to be the most commonly used. The lost compression is negligible. What’s more important, processor time or network bandwidth? Depends on the consumer.
|
Byte-for-byte equality of compressed data is not something that we're generally concerned about. There are lots of reasons why an encoder can spit out bytes differently than another encoder; the important thing is that those data streams decode the same. What we are concerned about is compression ratio and time to compress (or decompress). In the case of zlib-intel it was determined that the benefit to time to compress was worth the very small decrease in compression ratio. This is in conflict with the principles of CompressionLevel.Optimal outlined in the documentation, but as you said the ratio hit was considered negligible at the time. That said, your concern about the inferior ratio is a valid one. I don't think removing zlib-intel as the default is the correct decision for most people, but I recognize that there isn't really an alternative for scenarios such as yours beyond switching compression algorithms or compression libraries (neither of which I'm happy with recommending as a permanent solution). Your issue could be solved long-term in a few ways:
|
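The trade-off discussed above, compression ratio against time to compress, can be measured directly for a given workload. The sketch below is an illustration in Python with the standard `zlib` module; it does not benchmark zlib-intel or System.IO.Compression, and the `measure` helper and the synthetic `sample` input are assumptions made for the example.

```python
import time
import zlib


def measure(data: bytes, level: int) -> tuple:
    """Return (compressed size / original size, elapsed seconds) for one level."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return len(compressed) / len(data), elapsed


# Synthetic, moderately repetitive input (an assumption for illustration only).
sample = b"".join(
    b"line %d: some moderately repetitive text\n" % i for i in range(20000)
)

# Report ratio and time for a fast, a default, and a maximum-effort level.
for level in (1, 6, 9):
    ratio, seconds = measure(sample, level)
    print(f"level {level}: ratio {ratio:.3f}, {seconds * 1000:.1f} ms")
```

Running the same measurement on representative files (PNG scanlines, for instance) shows whether a small loss in ratio is worth the time saved for a particular consumer.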
This PR extends the optimizations from #5674 by statically linking Zlib-Intel to System.IO.Compression.Native on x86/x64 platforms running Linux. Testing on an Intel i5-4690 with Ubuntu 14.04, I see for an average file in the Canterbury Corpus:
@stephentoub @ianhays