made ascii string encoding faster #101777

gaaclarke · 2022-04-12T19:56:01Z

In local testing this made the StandardMessageCodec_string benchmark go from 0.51338 µs to 0.34857 µs (about a 32% decrease).

Don't land before #101767

Test coverage already exists; this is just a performance change.

Pre-launch Checklist

I read the Contributor Guide and followed the process outlined there for submitting PRs.
I read the Tree Hygiene wiki page, which explains my responsibilities.
I read and followed the Flutter Style Guide, including Features we expect every widget to implement.
I signed the CLA.
I listed at least one issue that this PR fixes in the description above.
I updated/added relevant documentation (doc comments with ///).
I added new tests to check the change I am making, or this PR is test-exempt.
All existing and new tests are passing.

If you need help, consider asking for advice on the #hackers-new channel on Discord.

flutter-dashboard · 2022-04-12T20:32:51Z

It looks like this pull request may not have tests. Please make sure to add tests before merging. If you need an exemption to this rule, contact Hixie on the #hackers channel in Chat (don't just cc him here, he won't see it! He's on Discord!).

If you are not sure if you need tests, consider this rule of thumb: the purpose of a test is to make sure someone doesn't accidentally revert the fix. Ask yourself, is there anything in your PR that you feel it is important we not accidentally revert back to how it was before your fix?

Reviewers: Read the Tree Hygiene page and make sure this patch meets those guidelines before LGTMing.

jonahwilliams · 2022-04-12T20:38:53Z

packages/flutter/lib/src/services/message_codecs.dart

+      final Uint8List asciiBytes = Uint8List(value.length);
+      Uint8List? utf8Bytes;
+      int utf8Offset = 0;
+      // Only do utf8 encoding if we encounter non-ascii characters.


I thought Dart's UTF-8 encoding already had a fast path for this?

Yep, but here are the reasons I believe we are getting faster results:

We are removing an extra copy of the data: https://github.com/dart-lang/sdk/blob/56035a7df0bb26a6babce53fae21d46263f3bf26/sdk/lib/convert/utf.dart#L115

It inlines the logic.

We are removing bounds checks: https://github.com/dart-lang/sdk/blob/56035a7df0bb26a6babce53fae21d46263f3bf26/sdk/lib/convert/utf.dart#L98

Here's another bounds check we were able to remove: https://github.com/dart-lang/sdk/blob/56035a7df0bb26a6babce53fae21d46263f3bf26/sdk/lib/convert/utf.dart#L201

Dart's UTF8 encoder main loop is: https://github.com/dart-lang/sdk/blob/main/sdk/lib/convert/utf.dart#L197

For strings containing only ASCII, Dart should just take each code unit and copy it into the output byte buffer, which is similar to what this loop is doing. I'm not sure why there would be a noticeable performance difference between Dart's encoder and this PR.

Ahh I see. Dart's encoder guarantees that the Uint8List it gives you is not just a view of some larger buffer somewhere else, but we don't need to worry about that since we're immediately writing this into another one.

Makes sense!

@jason-simmons that sublist at the end is copying the data though, that might be the biggest difference?

That makes sense - the typed data sublist is doing a memcpy. But this encoder can avoid that by writing the ASCII part and the non-ASCII part to the WriteBuffer as two separate chunks.
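For readers following along, the two-chunk idea above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `encodeInChunks` is a hypothetical helper name, and it returns the chunks in a list instead of writing them to a WriteBuffer.

```dart
import 'dart:convert';
import 'dart:typed_data';

/// Hypothetical helper (not part of Flutter): encodes [value] as one chunk
/// if it is pure ASCII, or as an ASCII prefix plus a UTF-8 tail otherwise.
List<Uint8List> encodeInChunks(String value) {
  final Uint8List asciiBytes = Uint8List(value.length);
  for (int i = 0; i < value.length; i += 1) {
    final int unit = value.codeUnitAt(i);
    if (unit <= 0x7f) {
      // ASCII code units map to a single identical UTF-8 byte.
      asciiBytes[i] = unit;
      continue;
    }
    // First non-ASCII code unit: hand the rest of the string to Dart's
    // encoder. Everything before index i is ASCII, so the split cannot
    // land in the middle of a surrogate pair.
    final Uint8List tail = utf8.encoder.convert(value, i);
    // sublistView is a view over asciiBytes; no bytes are copied here.
    return <Uint8List>[Uint8List.sublistView(asciiBytes, 0, i), tail];
  }
  return <Uint8List>[asciiBytes];
}
```

A caller would write the summed length of the returned chunks first and then each chunk in order, so the ASCII prefix is never copied into a combined buffer.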

jason-simmons · 2022-04-12T22:18:25Z

Dart also provides Utf8Encoder.startChunkedConversion, which sends each chunk of UTF-8 byte data to a sink interface without doing a memcpy.

However, that probably won't work for StandardMessageCodec because StandardMessageCodec's output format writes the length of the UTF-8 string before the string data. So there is no way to avoid accumulating all the data into an intermediate buffer before writing it to StandardMessageCodec's output stream.
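To make the limitation concrete, here is a small sketch of the chunked API. `ChunkCollector` and `encodeChunked` are hypothetical names used only for illustration; the point is that the total byte count is known only after the conversion has been closed.

```dart
import 'dart:convert';

/// Hypothetical sink that records every chunk the encoder emits.
class ChunkCollector implements Sink<List<int>> {
  final List<List<int>> chunks = <List<int>>[];
  int totalLength = 0;

  @override
  void add(List<int> data) {
    chunks.add(data);
    totalLength += data.length;
  }

  @override
  void close() {}
}

/// Encodes [value] through Utf8Encoder.startChunkedConversion.
ChunkCollector encodeChunked(String value) {
  final ChunkCollector collector = ChunkCollector();
  final StringConversionSink input =
      utf8.encoder.startChunkedConversion(collector);
  input.add(value);
  input.close();
  // collector.totalLength is final only at this point, but a
  // length-prefixed format needs it before the first data byte is written.
  return collector;
}
```

Because the length arrives last, a length-prefixed writer still has to hold on to the chunks (or patch the length in afterwards) before anything can be emitted.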

jonahwilliams · 2022-04-12T22:22:44Z

In theory you could arrange things such that we accumulate the utf8 bits after leaving a spot for a length, while measuring the length, then go back and write it in.

jonahwilliams · 2022-04-12T22:23:55Z

packages/flutter/lib/src/services/message_codecs.dart

+      Uint8List? utf8Bytes;
+      int utf8Offset = 0;
+      // Only do utf8 encoding if we encounter non-ascii characters.
+      for(int i = 0; i < value.length; ++i) {


If we're going to do this ourselves, we should have a wide variety of unit tests covering both ASCII and UTF-8 input to ensure that the data is not corrupted.

We already have those tests written here: https://github.com/flutter/flutter/blob/e6f302289014371326e480b293779827da0c81d5/packages/flutter/test/services/message_codecs_test.dart#L213:L213

Do you feel like those are sufficient?

The new code has complete test coverage. Every line is exercised by a test. I can't imagine another test or input that would exercise it differently.

If you feel that is sufficient, then that is fine. I also don't think you need a test exemption since you updated the benchmark, right?

gaaclarke · 2022-04-13T00:06:10Z

In theory you could arrange things such that we accumulate the utf8 bits after leaving a spot for a length, while measuring the length, then go back and write it in.

We are using variable-width sizes, so you can't know how much space to reserve, except you could probably just choose the max size (5 bytes). You'd have to double-check that the decoders would support that.
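As a rough sketch of why the reserved slot is awkward: the size prefix takes 1, 3, or 5 bytes depending on the value. The function names below are hypothetical, the layout follows the documented StandardMessageCodec size encoding as I understand it, and host endianness is an assumption of this sketch.

```dart
import 'dart:typed_data';

/// Sketch of the variable-width size prefix: 1 byte for values below 254,
/// marker 254 plus a uint16 up to 0xffff, marker 255 plus a uint32 otherwise.
Uint8List encodeSize(int value) {
  if (value < 254) {
    return Uint8List.fromList(<int>[value]);
  }
  if (value <= 0xffff) {
    final ByteData data = ByteData(3)
      ..setUint8(0, 254)
      ..setUint16(1, value, Endian.host);
    return data.buffer.asUint8List();
  }
  final ByteData data = ByteData(5)
    ..setUint8(0, 255)
    ..setUint32(1, value, Endian.host);
  return data.buffer.asUint8List();
}

/// Hypothetical padded form: always 5 bytes, so a slot could be reserved
/// up front and patched once the UTF-8 length is known.
Uint8List encodeSizePadded(int value) {
  final ByteData data = ByteData(5)
    ..setUint8(0, 255)
    ..setUint32(1, value, Endian.host);
  return data.buffer.asUint8List();
}
```

The padded form trades up to 4 extra bytes per string for a fixed-width slot; whether every platform decoder accepts a 5-byte prefix for a small value is the open question raised here.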

jonahwilliams

LGTM

flutter-dashboard bot added the framework flutter/packages/flutter repository. See also f: labels. label Apr 12, 2022

gaaclarke force-pushed the faster-string-encoding branch 2 times, most recently from 6e661f0 to 95c6e0d Compare April 12, 2022 20:18

gaaclarke marked this pull request as ready for review April 12, 2022 20:32

jonahwilliams reviewed Apr 12, 2022

gaaclarke requested a review from jonahwilliams April 12, 2022 20:52

jonahwilliams reviewed Apr 12, 2022

jonahwilliams reviewed Apr 12, 2022

jonahwilliams reviewed Apr 13, 2022

gaaclarke and others added 3 commits April 12, 2022 17:11

made ascii string encoding faster

244a083

+=

78bff09

Update packages/flutter/lib/src/services/message_codecs.dart

6d5551d

Co-authored-by: Jonah Williams <[email protected]>

gaaclarke force-pushed the faster-string-encoding branch from a799b72 to 6d5551d Compare April 13, 2022 00:11

switched to a sublistview

3809a09

gaaclarke requested a review from jonahwilliams April 13, 2022 00:24

jonahwilliams approved these changes Apr 13, 2022

gaaclarke added the waiting for tree to go green label Apr 13, 2022

fluttergithubbot merged commit fd73f27 into flutter:master Apr 13, 2022

This was referenced Apr 13, 2022

Roll Flutter from 30a501801af7 to fd73f2730c78 (1 revision) flutter/packages#1508

Merged

Roll Flutter from 888208c1f4e0 to fd73f2730c78 (10 revisions) flutter/plugins#5246

Merged

engine-flutter-autoroll added a commit to engine-flutter-autoroll/packages that referenced this pull request Apr 13, 2022

fd73f27 made ascii string encoding faster (flutter/flutter#101777)

84967bc

engine-flutter-autoroll added a commit to engine-flutter-autoroll/plugins that referenced this pull request Apr 13, 2022

fd73f27 made ascii string encoding faster (flutter/flutter#101777)

25e1906

jonahwilliams mentioned this pull request Jul 22, 2022

Provide a way to encode Strings to an existing buffer dart-lang/sdk#49470

Open
