This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Conversation

bjjones
Contributor

@bjjones bjjones commented Mar 14, 2016

This PR extends the optimizations from #5674 by statically linking Zlib-Intel to System.IO.Compression.Native on x86/x64 platforms running Linux. Testing on an Intel i5-4690 with Ubuntu 14.04, I see for an average file in the Canterbury Corpus:

  • +24% performance improvement for CompressionLevel.Optimal
  • +9% performance improvement for CompressionLevel.Fastest
  • Negligible/Unchanged performance for CompressionLevel.NoCompression

| Iterations | File | CompressionLevel | Adler | Intel | Intel/Adler |
|---|---|---|---|---|---|
| 50 | alice29.txt | Optimal | 6.731896 | 4.995399 | 74.20% |
| 50 | cp.html | Optimal | 0.549348 | 0.502904 | 91.55% |
| 50 | fields.c | Optimal | 0.263334 | 0.253578 | 96.30% |
| 50 | lcet10.txt | Optimal | 18.31165 | 13.59124 | 74.22% |
| 50 | asyoulik.txt | Optimal | 6.086299 | 4.28945 | 70.48% |
| 50 | kennedy.xls | Optimal | 28.88515 | 12.32973 | 42.69% |
| 50 | sum | Optimal | 1.4116 | 0.8322659 | 58.96% |
| 50 | ptt5 | Optimal | 8.719625 | 5.636329 | 64.64% |
| 50 | grammar.lsp | Optimal | 0.12538 | 0.1282 | 102.25% |
| 50 | plrabn12.txt | Optimal | 27.5093 | 18.22937 | 66.27% |
| 50 | xargs.1 | Optimal | 0.145356 | 0.146492 | 100.78% |
| 50 | alice29.txt | Fastest | 1.888226 | 1.792558 | 94.93% |
| 50 | cp.html | Fastest | 0.310862 | 0.293024 | 94.26% |
| 50 | fields.c | Fastest | 0.16198 | 0.160036 | 98.80% |
| 50 | lcet10.txt | Fastest | 5.153471 | 4.703264 | 91.26% |
| 50 | asyoulik.txt | Fastest | 1.611514 | 1.586706 | 98.46% |
| 50 | kennedy.xls | Fastest | 7.43489 | 5.414171 | 72.82% |
| 50 | sum | Fastest | 0.513908 | 0.4540399 | 88.35% |
| 50 | ptt5 | Fastest | 2.982993 | 1.916446 | 64.25% |
| 50 | grammar.lsp | Fastest | 0.09776202 | 0.097928 | 100.17% |
| 50 | plrabn12.txt | Fastest | 6.750174 | 6.123339 | 90.71% |
| 50 | xargs.1 | Fastest | 0.112222 | 0.119746 | 106.70% |
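
The headline percentages can be cross-checked against the raw timings in the table. The sketch below is in Python rather than .NET and assumes the "average file" figure is the unweighted mean of the per-file Intel/Adler ratios; the PR does not say how the averages were actually computed, so treat this as one plausible reading. The names `TIMINGS` and `mean_ratio` are made up for illustration.

```python
# Timings copied from the table above as (file, Adler, Intel), grouped by level.
TIMINGS = {
    "Optimal": [
        ("alice29.txt", 6.731896, 4.995399),
        ("cp.html", 0.549348, 0.502904),
        ("fields.c", 0.263334, 0.253578),
        ("lcet10.txt", 18.31165, 13.59124),
        ("asyoulik.txt", 6.086299, 4.28945),
        ("kennedy.xls", 28.88515, 12.32973),
        ("sum", 1.4116, 0.8322659),
        ("ptt5", 8.719625, 5.636329),
        ("grammar.lsp", 0.12538, 0.1282),
        ("plrabn12.txt", 27.5093, 18.22937),
        ("xargs.1", 0.145356, 0.146492),
    ],
    "Fastest": [
        ("alice29.txt", 1.888226, 1.792558),
        ("cp.html", 0.310862, 0.293024),
        ("fields.c", 0.16198, 0.160036),
        ("lcet10.txt", 5.153471, 4.703264),
        ("asyoulik.txt", 1.611514, 1.586706),
        ("kennedy.xls", 7.43489, 5.414171),
        ("sum", 0.513908, 0.4540399),
        ("ptt5", 2.982993, 1.916446),
        ("grammar.lsp", 0.09776202, 0.097928),
        ("plrabn12.txt", 6.750174, 6.123339),
        ("xargs.1", 0.112222, 0.119746),
    ],
}


def mean_ratio(rows):
    """Unweighted mean of Intel/Adler per file (below 1.0 means Intel took less time)."""
    ratios = [intel / adler for _, adler, intel in rows]
    return sum(ratios) / len(ratios)


for level, rows in TIMINGS.items():
    ratio = mean_ratio(rows)
    print(f"{level}: mean Intel/Adler = {ratio:.1%}, time saved = {1 - ratio:.1%}")
```

Under that assumption the mean time saved comes out near 23% for Optimal and near 9% for Fastest, which is close to the figures quoted above.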

@stephentoub @ianhays

@stephentoub
Member

It's great that there are perf improvements from this, but at least for right now I would prefer we not do this. We should use whatever zlib the user has on their system, either having built/installed explicitly or gotten from their package manager.

cc'ing a few folks for their opinions in case they differ from mine:
cc: @ianhays, @joshfree, @bartonjs, @morganbr, @jkotas

@bjjones
Contributor Author

bjjones commented Mar 14, 2016

Thanks for the feedback, @stephentoub. Could you elaborate a bit more on why you prefer using the system-provided library? If possible, I would like the Linux build to be at parity with the Windows build in the architectural features it uses.

It sounds like you may be open to this PR in the future?

@stephentoub
Member

In general we want to use the libraries that are standard to that platform. We only ship zlib as part of System.IO.Compression on Windows because there isn't one built into / exposed from Windows and there is no similar default package manager for the OS, and so it's a necessity... that's not the case on Unix systems. If something else in the app is using zlib, we shouldn't be using a different one. If there's a vulnerability in zlib and the admin of the system updates the zlib on the machine, we don't want them to also need to patch another unrelated binary. If the dev or admin has chosen a particular build of zlib for their own reasons, maybe even made their own improvements, we'd like to use whichever they've deemed most applicable. Etc. It's great that Intel's made improvements, and I would much rather see Intel push its improvements upstream into the main zlib, such that all platforms get the benefits, rather than us getting those benefits by effectively shipping our own.
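
As a rough analogy outside .NET, Python's `zlib` module shows what "use whatever zlib is on the system" looks like from a consumer's side: it reports both the zlib version it was built against and the version actually loaded at run time. This is only an illustration of the dynamic-linking argument, not how System.IO.Compression.Native resolves zlib; the helper name `zlib_versions` is made up for this sketch.

```python
import zlib


def zlib_versions():
    """Return (build-time, run-time) zlib version strings for this interpreter."""
    return zlib.ZLIB_VERSION, zlib.ZLIB_RUNTIME_VERSION


built, loaded = zlib_versions()
# When zlib is a shared system library, an admin can patch it and the run-time
# version changes without rebuilding the consumer. A statically linked copy
# would stay at whatever version it was compiled with.
print(f"built against zlib {built}, running with zlib {loaded}")
```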

@bjjones
Contributor Author

bjjones commented Mar 14, 2016

Ah, that makes sense. Thanks.

@bartonjs
Member

I agree with @stephentoub. If zlib-intel gets picked up as a zlib-compatible package on various distros then users would be able to switch to it and gain whatever benefits might be had in their native applications (e.g. Apache) as well as with .NET Core.

@bjjones
Contributor Author

bjjones commented Mar 15, 2016

Thanks for the feedback. Closing.

@bjjones bjjones closed this Mar 15, 2016
@stephentoub
Member

Thanks, @bjjones.

@jherby2k

Just stumbled upon this while trying to figure out why CoreFX on Windows generates different output from every other platform. IMO consistent cross-platform results should be expected with .NET Core, and this PR (assuming it generates identical output to the Windows implementation) should be implemented for that reason.

@danmoseley
Member

@jherby2k by output, you mean it's a perfectly valid output but just not byte for byte identical? Why is that important to your scenario?

@jherby2k

In my case I was trying to troubleshoot why ImageSharp was generating different PNG images on Windows and Linux, causing unit tests to fail. After confirming there is no problem with either file, I had to add diverging test paths based on the OS. Just wasted cycles, that's all.

@JimBobSquarePants

JimBobSquarePants commented Sep 14, 2018

DeflateStream is used in PNG encoding/decoding, and the differences in compression output lead to different file sizes. In this case, however, the Intel implementation produces larger output, so the Adler implementation would be preferred, which means the rejection of this PR gets my support. A PNG is encoded once but can be downloaded thousands of times; bytes count.

I see that the compression changes were noted in #5674 (comment) but dismissed as minor. In my opinion that was an oversight.

@jherby2k

jherby2k commented Sep 14, 2018 via email

@ianhays
Contributor

ianhays commented Sep 28, 2018

Byte-for-byte equality of compressed data is not something that we're generally concerned about. There are lots of reasons why an encoder can spit out bytes differently than another encoder; the important thing is that those data streams decode the same.
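
To make that point concrete, here is a minimal sketch using Python's standard `zlib` module as a stand-in for two different encoders (it is not DeflateStream, and different compression levels are only a proxy for different zlib builds). The helper name `encodings_agree` and the sample data are made up for illustration.

```python
import zlib


def encodings_agree(data: bytes, levels=(1, 6, 9)) -> bool:
    """True if the output of every level decompresses back to the original data."""
    return all(zlib.decompress(zlib.compress(data, lvl)) == data for lvl in levels)


sample = b"the quick brown fox jumps over the lazy dog\n" * 200
fast = zlib.compress(sample, 1)
best = zlib.compress(sample, 9)

# The compressed bytes usually differ between settings or implementations...
print("identical compressed bytes:", fast == best)
# ...but a correct decoder recovers the same payload from either stream.
print("identical decoded payload:", zlib.decompress(fast) == zlib.decompress(best))
```

A test written along these lines compares the decoded payload rather than the encoded bytes, so it does not need separate expectations per OS.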

What we are concerned about is compression ratio and time to compress (or decompress). In the case of zlib-intel it was determined that the improvement in compression time was worth the very small decrease in compression ratio. This is in conflict with the principles of CompressionLevel.Optimal outlined in the documentation, but as you said the ratio hit was considered negligible at the time.
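
The two quantities in question can be measured side by side. The sketch below again uses Python's `zlib` rather than .NET, with levels 0, 1, and 9 as loose analogues of NoCompression, Fastest, and Optimal; that mapping is an assumption for illustration, and `measure` is a hypothetical helper.

```python
import time
import zlib


def measure(data: bytes, level: int):
    """Return (compressed size / original size, elapsed seconds) for one level."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return len(compressed) / len(data), elapsed


sample = b"the quick brown fox jumps over the lazy dog\n" * 200
for level in (0, 1, 9):
    ratio, seconds = measure(sample, level)
    print(f"level {level}: ratio {ratio:.3f}, {seconds * 1e6:.0f} microseconds")
```

On real inputs, a single timing like this is noisy; repeating each measurement, as the 50 iterations in the PR's table do, gives more stable numbers.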

That said, your concern about the inferior ratio is a valid one. I don't think removing zlib-intel as the default is the correct decision for most people, but I recognize that there isn't really an alternative for scenarios such as yours beyond switching compression algorithms or compression libraries (neither of which I'm happy with recommending as a permanent solution).

Your issue could be solved long-term in a few ways:
