-
Notifications
You must be signed in to change notification settings - Fork 343
Api brotli changes #1621
Api brotli changes #1621
Conversation
|
@Vedin, |
| public void Dispose() | ||
| { | ||
| // BrotliNative.BrotliDecoderDestroyInstance(BrotliNativeState); | ||
| // BrotliNative.BrotliEncoderDestroyInstance(BrotliNativeState); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@KrzysztofCwalina This 2 methods, which I mentioned above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are going to have to hold enough information that allows you to distinguish between whether you are doing compress or decompress, then call the correct de-allocation function.
|
|
||
| public void SetQuality(uint quality) | ||
| { | ||
| if (quality < MinQuality || quality > MaxQuality) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MinQuality is 0 and quality is an unsigned int. Quality will never be < MinQuality and hence this check is unnecessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
|
|
||
| public void SetWindow(uint window) | ||
| { | ||
| if (window < MinWindowBits || window > MaxWindowBits) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (window - MinWindowBits > MaxWindowBits - MinWindowBits) saves an extra conditional branch.
See: #1616 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make sense, nice optimization
| } | ||
|
|
||
| internal static TransformationStatus Compress(ReadOnlySpan<byte> source, Span<byte> destination, out int bytesConsumed, out int bytesWritten, int quality = DefaultQuality, int windowSize = DefaultWindowSize, BrotliEncoderMode encMode = BrotliEncoderMode.Generic) | ||
| public static TransformationStatus FlushEncoder(Span<byte> source, Span<byte> destination, out int bytesConsumed, out int bytesWritten, ref State state, bool is_finished = true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: argument name doesn't follow coding style
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
| bytesConsumed = bytesWritten = 0; | ||
| BrotliEncoderOperation operation = is_finished ? BrotliEncoderOperation.Finish : BrotliEncoderOperation.Flush; | ||
| bytesConsumed = source.Length; | ||
| bytesWritten = destination.Length; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should not be set to source.Length and destination.Length in the beginning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
really it should be because we send this variables to the native library (where they means array size)
(see written and consumed)
| bytesWritten = (int)written; | ||
| return TransformationStatus.Done; | ||
| } | ||
| bytesWritten = destination.Length - bytesWritten; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am curious, why do we do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because bytesWritten after calling a native func is: "how many bytes have already free", and we convert it to how many bytes have already written
| return TransformationStatus.InvalidData; | ||
| }; | ||
| bytesConsumed = (int)consumed; | ||
| bytesWritten = (int)((nuint)destination.Length - written); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't this set bytesWritten to 0 in the success/done case, if written == destination.Length?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's true and it should works like this
| bool errorDetected = false; | ||
| bytesConsumed = source.Length; | ||
| bytesWritten = destination.Length; | ||
| BrotliDecoderResult LastDecoderResult = BrotliDecoderResult.NeedsMoreInput; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is NeedsMoreInput the starting state of LastDecoderResult?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because at start we haven't send anything for decompress, so we need more input
| } | ||
| else | ||
| { | ||
| endOfStream = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this bool necessary? The other two conditions return, so reaching here automatically implies endOfStream is true. No?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
endOfStream was deleted
|
|
||
| if (endOfStream && !BrotliNative.BrotliDecoderIsFinished(state.BrotliNativeState)) | ||
| { | ||
| errorDetected = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, I don't think we need this. You can just use !BrotliNative.BrotliDecoderIsFinished(state.BrotliNativeState)) at line 209/215. Let me know if I am mistaken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I rewrote the logic of detecting errors, but you are right
| var text = BrotliNative.BrotliDecoderErrorString(error); | ||
| throw new System.IO.IOException(text + BrotliEx.unableDecode); | ||
| } | ||
| if (endOfStream && !BrotliNative.BrotliDecoderIsFinished(state.BrotliNativeState) && LastDecoderResult == BrotliDecoderResult.NeedsMoreInput) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if this will ever be true.
if (x && y && LastDecoderResult == BrotliDecoderResult.NeedsMoreInput) is false if LastDecoderResult == BrotliDecoderResult.NeedsMoreInput is false (regardless of what x and y are), and we only get here if that is the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, fixed it
| public void Dispose() | ||
| { | ||
| // BrotliNative.BrotliDecoderDestroyInstance(BrotliNativeState); | ||
| // BrotliNative.BrotliEncoderDestroyInstance(BrotliNativeState); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are going to have to hold enough information that allows you to distinguish between whether you are doing compress or decompress, then call the correct de-allocation function.
| // BrotliNative.BrotliEncoderDestroyInstance(BrotliNativeState); | ||
| } | ||
|
|
||
| public void InitializeDecoder() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest a single initialize function with a flag that says whether they want compress or decompress. Also, I'd make the initialization protect against a second call. If you don't you will leak native memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might also consider making these internal. State is a struct, so have the default object be "uninitialized" then have the state initialize itself on first use. That makes the developer have to so less work and won't get it wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that in encoder mode developer also may want to set quality and window size, in this case He needs to have a state before first call. Possible solution is add another Compress method with quality and window size, but it doesn't look obvious for user to call one method at first iteration and other on next iterations, does it?
| _encoder = new Encoder(); | ||
| _encoder.SetQuality(); | ||
| _encoder.SetWindow(); | ||
| _state.InitializeEncoder(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the stream case, I would say that an exception bubbling up is probably a good thing, unless you plan on wrapping and re-throwing.
| if (BrotliNative.BrotliEncoderIsFinished(_state.BrotliNativeState)) return; | ||
| _nextInput = new byte[0]; | ||
| _availableInput = 0; | ||
| TransformationStatus ts = TransformationStatus.DestinationTooSmall; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename ts to something more meaningful. Even "status" would be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed to flushStatus
| } | ||
| else | ||
| { | ||
| throw new UnauthorizedAccessException(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That exception is used for security checks. It is not appropriate here.
| } | ||
| copyLen = bytesRemain > _bufferSize ? _bufferSize : bytesRemain; | ||
| Marshal.Copy(buffer, currentOffset, _bufferInput, copyLen); | ||
| byte[] bufferInput = new byte[copyLen]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this extra buffer allocation and copy necessary? If you are going to need an extra buffer on each write call, you might want to rent a buffer from a pool, have your own buffer that you re-use, or use a stack buffer (if it isn't too large).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessary It is temporary. What the problem is that signature of Compress takes out parameter (consumed) and input is exactly as source size. I can change consumed to ref parameter, but it is inconsistent with other Encode/Decode methods, which using Span.
May be I can fix it using Span.Slice and just cut the buffer instead of copy it to _bufferInput?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shiftylogic Is something like this ok?
Span<byte> bufferInput = new Span<byte>(buffer);
bufferInput.Slice(0, copyLen);
transformationResult = Brotli.Compress(bufferInput, _bufferOutput, out _availableInput, out _availableOutput, ref _state);
| BrotliEncoderOperation operation = isFinished ? BrotliEncoderOperation.Finish : BrotliEncoderOperation.Flush; | ||
| bytesConsumed = source.Length; | ||
| bytesWritten = destination.Length; | ||
| bytesConsumed = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why set bytesConsumed to source.Length first and then to 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deleted
| TransformationStatus transformationResult = TransformationStatus.DestinationTooSmall; | ||
| while (transformationResult == TransformationStatus.DestinationTooSmall) | ||
| /*Span<byte> bufferInput = new Span<byte>(buffer); | ||
| bufferInput.Slice(0, copyLen);*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove commented out code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was a question to @shiftylogic Can I use smth like this instead of allocate and copy array?
| } | ||
|
|
||
| private void ValidateCompressedData(Span<byte> data, byte[] expected) | ||
| private void ValidateCompressedData(byte[] data, byte[] expected) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this change required?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to understand that I send byte array not span, in fact it wasn't change anything
| byte[] data = File.ReadAllBytes(testFilePath); | ||
| var bytes = new byte[bufferSize]; | ||
| foreach (var iteration in Benchmark.Iterations) | ||
| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, this filename has a typo:
BrotliPerfomanceTests.cs => BrotliPerformanceTests.cs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wow, really
| { | ||
| internal IntPtr BrotliNativeState { get; private set; } | ||
| internal BrotliDecoderResult LastDecoderResult; | ||
| public bool CompressMode { get; set; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably be readonly and set internally when InitializeDecoder or InitializeEncoder is called.
| _availableInput = (nuint)copyLen; | ||
| _nextInput = _bufferInput; | ||
| while ((int)_availableInput > 0) | ||
| byte[] bufferInput = new byte[copyLen]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this extra array and copy needed? Why can't you use the input directly for the compress call?
| while (true) | ||
| if (_state.BrotliNativeState == IntPtr.Zero) return; | ||
| if (BrotliNative.BrotliEncoderIsFinished(_state.BrotliNativeState)) return; | ||
| _nextInput = new byte[0]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why allocate an empty array and throw away the one you allocated in the constructor?
| public static TransformationStatus Compress(ReadOnlySpan<byte> source, Span<byte> destination, out int bytesConsumed, out int bytesWritten, ref State state) | ||
| { | ||
| EnsureInitialized(ref state, true); | ||
| try |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This try catch and throw is not necessary if all you are doing it re-throwing the same exception. Just let it bubble up to the caller.
| public static TransformationStatus Decompress(ReadOnlySpan<byte> source, Span<byte> destination, out int bytesConsumed, out int bytesWritten, ref State state) | ||
| { | ||
| EnsureInitialized(ref state, false); | ||
| try |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. Just let this bubble up to the caller.
| } | ||
| } | ||
|
|
||
| public static void EnsureInitialized(ref State state, bool compress) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably want this internal, unless you think someone needs this for external use.
| } | ||
|
|
||
| public static TransformationStatus Compress(ReadOnlySpan<byte> source, Span<byte> destination, out int bytesConsumed, out int bytesWritten, CompressionLevel quality = (CompressionLevel)DefaultQuality, int windowSize = DefaultWindowSize, BrotliEncoderMode encMode = BrotliEncoderMode.Generic) | ||
| public static TransformationStatus FlushEncoder(Span<byte> source, Span<byte> destination, out int bytesConsumed, out int bytesWritten, ref State state, bool isFinished = true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why change the input source to Span instead of ReadOnlySpan?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is different methods, but yes their should be ReadOnlySpan also
| nuint availableOutput = (nuint)bytesWritten; | ||
| nuint consumed = (nuint)bytesConsumed; | ||
| state.LastDecoderResult = BrotliNative.BrotliDecoderDecompressStream(state.BrotliNativeState, ref consumed, ref bufIn, ref availableOutput, ref bufOut, out nuint totalOut); | ||
| bytesWritten = (int)((nuint)destination.Length - availableOutput); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is simpler. Would it work?
bytesWritten = destination.Length - (int)availableOutput;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I should change it, I just think that it is better to do arithmetic operations in the narrowed type, than expand it to int and than do minus
| public partial class BrotliStream : Stream | ||
| { | ||
| private const int DefaultBufferSize = (1 << 16) - 1; | ||
| private const int DefaultBufferSize = (1 << 16) - 16; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add the decimal value of DefaultBufferSize as a comment?
| throw new System.IO.IOException(BrotliEx.unableEncode); | ||
| var extraData = (nuint)_availableOutput != (nuint)_bufferSize; | ||
| if (extraData) | ||
| flushStatus = Brotli.FlushEncoder(Array.Empty<byte>(), _buffer, out _availableInput, out _availableOutput, ref _state, finished); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, why do we pass an empty array here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because, when we start flush stream, we shouldn't send any more date to the original native lib
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not Span.Empty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm...Does Span.Empty work faster? I just thought it is the same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm...Does Span.Empty work faster? I just thought it is the same
I am not certain (will have to look at assembly). However, FlushEncoder takes a span, and if Span.Empty works, I would use that. If there is a specific reason to keep Array.Empty instead and use the implicit cast to Span, then use array. I believe you avoid the implicit cast: https://github.com/dotnet/corefx/blob/master/src/System.Memory/src/System/Span.cs#L376
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Span.Empty should be faster because it avoids the implicit cast and thus the extra ctor call.
* Optimizing some int32 parsers and clean up * wip - adding tests and fixing impl bugs * WIP - new implementation and tests to compare with previous * WIP - fixing non-invariant parser bugs and adding tests * Cleaning up functional and performance tests and removing old code. * Removing unnecessary using directive * Addressing PR comments and adding more tests. * Addressing PR comments and adding loop unrolling. * Fixing issues missed after merge * Fixing non-invariant int32 parser * Addressing PR comment, removing unused DangerousGetPinnableReference * Cleanup and updating test helpers * Patch to dotnet install scripts for downloading via new blob URL. (#1632) * Removing text.Length cache and changing to unsigned comparison * Adding comment and removing unnecessary math operations using 0 (D0) * Api brotli changes (#1621) * CaterburyPerf * flush/compress * huge changes * space * small * flush/compress * huge changes * space * small * unsaved chagnes * test fix depends on APIchanges * resolve issues * Less alocation, change State * issue * remove unnecessary * resolve * clean up changes, change state * resolve * guidelines * fix bug * resolve issues * change access * small issues
Changes to the Brotli API. What's new:
Brotli (Primitives) was rewrote and now return all possible TransformationStatus.
BroltiStream now base on Brotli class. Also it uses twice less memory as before and doesn't use marshaling.
@KrzysztofCwalina Finally, I remove everything from State and rewrite all in way where we need only BrotliNativeState and LastDecoderResult, So how to Dispose it know (there are 2 commented methods for compression/decompression mode? May be add some flag to detect in what mode state now?
BrotliPrimitivesTests needs an update.