Thanks to visit codestin.com
Credit goes to github.com

Skip to content
This repository was archived by the owner on Aug 2, 2023. It is now read-only.

Conversation

Vedin
Copy link
Contributor

@Vedin Vedin commented Jun 21, 2017

Changes to the Brotli API. What's new:

  1. Brotli (Primitives) was rewrote and now return all possible TransformationStatus.

  2. BroltiStream now base on Brotli class. Also it uses twice less memory as before and doesn't use marshaling.

  3. @KrzysztofCwalina Finally, I remove everything from State and rewrite all in way where we need only BrotliNativeState and LastDecoderResult, So how to Dispose it know (there are 2 commented methods for compression/decompression mode? May be add some flag to detect in what mode state now?

  4. BrotliPrimitivesTests needs an update.

@dnfclas
Copy link

dnfclas commented Jun 21, 2017

@Vedin,
Thanks for having already signed the Contribution License Agreement. Your agreement was validated by .NET Foundation. We will now review your pull request.
Thanks,
.NET Foundation Pull Request Bot

public void Dispose()
{
// BrotliNative.BrotliDecoderDestroyInstance(BrotliNativeState);
// BrotliNative.BrotliEncoderDestroyInstance(BrotliNativeState);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@KrzysztofCwalina This 2 methods, which I mentioned above

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are going to have to hold enough information that allows you to distinguish between whether you are doing compress or decompress, then call the correct de-allocation function.


public void SetQuality(uint quality)
{
if (quality < MinQuality || quality > MaxQuality)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MinQuality is 0 and quality is an unsigned int. Quality will never be < MinQuality and hence this check is unnecessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed


public void SetWindow(uint window)
{
if (window < MinWindowBits || window > MaxWindowBits)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (window - MinWindowBits > MaxWindowBits - MinWindowBits) saves an extra conditional branch.

See: #1616 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make sense, nice optimization

}

internal static TransformationStatus Compress(ReadOnlySpan<byte> source, Span<byte> destination, out int bytesConsumed, out int bytesWritten, int quality = DefaultQuality, int windowSize = DefaultWindowSize, BrotliEncoderMode encMode = BrotliEncoderMode.Generic)
public static TransformationStatus FlushEncoder(Span<byte> source, Span<byte> destination, out int bytesConsumed, out int bytesWritten, ref State state, bool is_finished = true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: argument name doesn't follow coding style

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

bytesConsumed = bytesWritten = 0;
BrotliEncoderOperation operation = is_finished ? BrotliEncoderOperation.Finish : BrotliEncoderOperation.Flush;
bytesConsumed = source.Length;
bytesWritten = destination.Length;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should not be set to source.Length and destination.Length in the beginning.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

really it should be because we send this variables to the native library (where they means array size)
(see written and consumed)

bytesWritten = (int)written;
return TransformationStatus.Done;
}
bytesWritten = destination.Length - bytesWritten;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am curious, why do we do this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because bytesWritten after calling a native func is: "how many bytes have already free", and we convert it to how many bytes have already written

return TransformationStatus.InvalidData;
};
bytesConsumed = (int)consumed;
bytesWritten = (int)((nuint)destination.Length - written);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't this set bytesWritten to 0 in the success/done case, if written == destination.Length?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's true and it should works like this

bool errorDetected = false;
bytesConsumed = source.Length;
bytesWritten = destination.Length;
BrotliDecoderResult LastDecoderResult = BrotliDecoderResult.NeedsMoreInput;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is NeedsMoreInput the starting state of LastDecoderResult?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because at start we haven't send anything for decompress, so we need more input

}
else
{
endOfStream = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this bool necessary? The other two conditions return, so reaching here automatically implies endOfStream is true. No?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

endOfStream was deleted


if (endOfStream && !BrotliNative.BrotliDecoderIsFinished(state.BrotliNativeState))
{
errorDetected = true;
Copy link
Contributor

@ahsonkhan ahsonkhan Jun 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, I don't think we need this. You can just use !BrotliNative.BrotliDecoderIsFinished(state.BrotliNativeState)) at line 209/215. Let me know if I am mistaken.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rewrote the logic of detecting errors, but you are right

var text = BrotliNative.BrotliDecoderErrorString(error);
throw new System.IO.IOException(text + BrotliEx.unableDecode);
}
if (endOfStream && !BrotliNative.BrotliDecoderIsFinished(state.BrotliNativeState) && LastDecoderResult == BrotliDecoderResult.NeedsMoreInput)
Copy link
Contributor

@ahsonkhan ahsonkhan Jun 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if this will ever be true.
if (x && y && LastDecoderResult == BrotliDecoderResult.NeedsMoreInput) is false if LastDecoderResult == BrotliDecoderResult.NeedsMoreInput is false (regardless of what x and y are), and we only get here if that is the case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, fixed it

public void Dispose()
{
// BrotliNative.BrotliDecoderDestroyInstance(BrotliNativeState);
// BrotliNative.BrotliEncoderDestroyInstance(BrotliNativeState);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are going to have to hold enough information that allows you to distinguish between whether you are doing compress or decompress, then call the correct de-allocation function.

// BrotliNative.BrotliEncoderDestroyInstance(BrotliNativeState);
}

public void InitializeDecoder()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest a single initialize function with a flag that says whether they want compress or decompress. Also, I'd make the initialization protect against a second call. If you don't you will leak native memory.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might also consider making these internal. State is a struct, so have the default object be "uninitialized" then have the state initialize itself on first use. That makes the developer have to so less work and won't get it wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that in encoder mode developer also may want to set quality and window size, in this case He needs to have a state before first call. Possible solution is add another Compress method with quality and window size, but it doesn't look obvious for user to call one method at first iteration and other on next iterations, does it?

_encoder = new Encoder();
_encoder.SetQuality();
_encoder.SetWindow();
_state.InitializeEncoder();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the stream case, I would say that an exception bubbling up is probably a good thing, unless you plan on wrapping and re-throwing.

if (BrotliNative.BrotliEncoderIsFinished(_state.BrotliNativeState)) return;
_nextInput = new byte[0];
_availableInput = 0;
TransformationStatus ts = TransformationStatus.DestinationTooSmall;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename ts to something more meaningful. Even "status" would be better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed to flushStatus

}
else
{
throw new UnauthorizedAccessException();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That exception is used for security checks. It is not appropriate here.

}
copyLen = bytesRemain > _bufferSize ? _bufferSize : bytesRemain;
Marshal.Copy(buffer, currentOffset, _bufferInput, copyLen);
byte[] bufferInput = new byte[copyLen];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this extra buffer allocation and copy necessary? If you are going to need an extra buffer on each write call, you might want to rent a buffer from a pool, have your own buffer that you re-use, or use a stack buffer (if it isn't too large).

Copy link
Contributor Author

@Vedin Vedin Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessary It is temporary. What the problem is that signature of Compress takes out parameter (consumed) and input is exactly as source size. I can change consumed to ref parameter, but it is inconsistent with other Encode/Decode methods, which using Span.

May be I can fix it using Span.Slice and just cut the buffer instead of copy it to _bufferInput?

Copy link
Contributor Author

@Vedin Vedin Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shiftylogic Is something like this ok?

Span<byte> bufferInput = new Span<byte>(buffer);
bufferInput.Slice(0, copyLen);
transformationResult = Brotli.Compress(bufferInput, _bufferOutput, out _availableInput, out _availableOutput, ref _state);

BrotliEncoderOperation operation = isFinished ? BrotliEncoderOperation.Finish : BrotliEncoderOperation.Flush;
bytesConsumed = source.Length;
bytesWritten = destination.Length;
bytesConsumed = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why set bytesConsumed to source.Length first and then to 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deleted

TransformationStatus transformationResult = TransformationStatus.DestinationTooSmall;
while (transformationResult == TransformationStatus.DestinationTooSmall)
/*Span<byte> bufferInput = new Span<byte>(buffer);
bufferInput.Slice(0, copyLen);*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: remove commented out code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was a question to @shiftylogic Can I use smth like this instead of allocate and copy array?

}

private void ValidateCompressedData(Span<byte> data, byte[] expected)
private void ValidateCompressedData(byte[] data, byte[] expected)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this change required?

Copy link
Contributor Author

@Vedin Vedin Jun 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to understand that I send byte array not span, in fact it wasn't change anything

byte[] data = File.ReadAllBytes(testFilePath);
var bytes = new byte[bufferSize];
foreach (var iteration in Benchmark.Iterations)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, this filename has a typo:
BrotliPerfomanceTests.cs => BrotliPerformanceTests.cs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow, really

{
internal IntPtr BrotliNativeState { get; private set; }
internal BrotliDecoderResult LastDecoderResult;
public bool CompressMode { get; set; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be readonly and set internally when InitializeDecoder or InitializeEncoder is called.

_availableInput = (nuint)copyLen;
_nextInput = _bufferInput;
while ((int)_availableInput > 0)
byte[] bufferInput = new byte[copyLen];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this extra array and copy needed? Why can't you use the input directly for the compress call?

while (true)
if (_state.BrotliNativeState == IntPtr.Zero) return;
if (BrotliNative.BrotliEncoderIsFinished(_state.BrotliNativeState)) return;
_nextInput = new byte[0];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why allocate an empty array and throw away the one you allocated in the constructor?

public static TransformationStatus Compress(ReadOnlySpan<byte> source, Span<byte> destination, out int bytesConsumed, out int bytesWritten, ref State state)
{
EnsureInitialized(ref state, true);
try
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This try catch and throw is not necessary if all you are doing it re-throwing the same exception. Just let it bubble up to the caller.

public static TransformationStatus Decompress(ReadOnlySpan<byte> source, Span<byte> destination, out int bytesConsumed, out int bytesWritten, ref State state)
{
EnsureInitialized(ref state, false);
try
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. Just let this bubble up to the caller.

}
}

public static void EnsureInitialized(ref State state, bool compress)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably want this internal, unless you think someone needs this for external use.

}

public static TransformationStatus Compress(ReadOnlySpan<byte> source, Span<byte> destination, out int bytesConsumed, out int bytesWritten, CompressionLevel quality = (CompressionLevel)DefaultQuality, int windowSize = DefaultWindowSize, BrotliEncoderMode encMode = BrotliEncoderMode.Generic)
public static TransformationStatus FlushEncoder(Span<byte> source, Span<byte> destination, out int bytesConsumed, out int bytesWritten, ref State state, bool isFinished = true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why change the input source to Span instead of ReadOnlySpan?

Copy link
Contributor Author

@Vedin Vedin Jun 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is different methods, but yes their should be ReadOnlySpan also

nuint availableOutput = (nuint)bytesWritten;
nuint consumed = (nuint)bytesConsumed;
state.LastDecoderResult = BrotliNative.BrotliDecoderDecompressStream(state.BrotliNativeState, ref consumed, ref bufIn, ref availableOutput, ref bufOut, out nuint totalOut);
bytesWritten = (int)((nuint)destination.Length - availableOutput);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is simpler. Would it work?
bytesWritten = destination.Length - (int)availableOutput;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I should change it, I just think that it is better to do arithmetic operations in the narrowed type, than expand it to int and than do minus

public partial class BrotliStream : Stream
{
private const int DefaultBufferSize = (1 << 16) - 1;
private const int DefaultBufferSize = (1 << 16) - 16;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add the decimal value of DefaultBufferSize as a comment?

throw new System.IO.IOException(BrotliEx.unableEncode);
var extraData = (nuint)_availableOutput != (nuint)_bufferSize;
if (extraData)
flushStatus = Brotli.FlushEncoder(Array.Empty<byte>(), _buffer, out _availableInput, out _availableOutput, ref _state, finished);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, why do we pass an empty array here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because, when we start flush stream, we shouldn't send any more date to the original native lib

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not Span.Empty?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm...Does Span.Empty work faster? I just thought it is the same

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm...Does Span.Empty work faster? I just thought it is the same

I am not certain (will have to look at assembly). However, FlushEncoder takes a span, and if Span.Empty works, I would use that. If there is a specific reason to keep Array.Empty instead and use the implicit cast to Span, then use array. I believe you avoid the implicit cast: https://github.com/dotnet/corefx/blob/master/src/System.Memory/src/System/Span.cs#L376

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Span.Empty should be faster because it avoids the implicit cast and thus the extra ctor call.

@Vedin Vedin merged commit 1a329f4 into dotnet:master Jun 27, 2017
ahsonkhan added a commit that referenced this pull request Jun 27, 2017
* Optimizing some int32 parsers and clean up

* wip - adding tests and fixing impl bugs

* WIP - new implementation and tests to compare with previous

* WIP - fixing non-invariant parser bugs and adding tests

* Cleaning up functional and performance tests and removing old code.

* Removing unnecessary using directive

* Addressing PR comments and adding more tests.

* Addressing PR comments and adding loop unrolling.

* Fixing issues missed after merge

* Fixing non-invariant int32 parser

* Addressing PR comment, removing unused DangerousGetPinnableReference

* Cleanup and updating test helpers

* Patch to dotnet install scripts for downloading via new blob URL. (#1632)

* Removing text.Length cache and changing to unsigned comparison

* Adding comment and removing unnecessary math operations using 0 (D0)

* Api brotli changes (#1621)

* CaterburyPerf

* flush/compress

* huge changes

* space

* small

* flush/compress

* huge changes

* space

* small

* unsaved chagnes

* test fix depends on APIchanges

* resolve issues

* Less alocation, change State

* issue

* remove unnecessary

* resolve

* clean up changes, change state

* resolve

* guidelines

* fix bug

* resolve issues

* change access

* small issues
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants