fix performance issue with ResponseStream wrapper #6937
Merged
This PR fixes a performance issue that occurred with AWS responses that have streaming bodies. The change exposes the underlying iterator of `Response` directly through the `_ResponseStream` wrapper, so invocations of the form `for item in stream` no longer go through the pointless `readinto` abstraction.
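In essence, the wrapper now hands out the response's own iterator instead of letting `IOBase` synthesize one. A minimal sketch of the idea (the class body and names are illustrative, not the actual LocalStack code; `chunks` stands in for the underlying `Response` body):

```python
from io import RawIOBase


class _ResponseStream(RawIOBase):
    # Sketch only: `chunks` stands in for the underlying response body,
    # assumed to be an iterable of byte strings.

    def __init__(self, chunks):
        self._iterator = iter(chunks)

    def __iter__(self):
        # The fix in a nutshell: `for item in stream` now walks the
        # underlying iterator directly instead of going through IOBase's
        # default iteration, which funnels every byte through readinto().
        return self._iterator

    def readable(self):
        return True
```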
### Detailed explanation

This issue surfaced with the new S3 provider, where the time for a `get_object` download grew quadratically with the size of the body. When bodies are streamed, we wrap the response in a `_ResponseStream` object that exposes the response as a readable IO object (`localstack/localstack/aws/client.py`, line 150 at commit `68c507e`).
The issue is that the `_ResponseStream` is consumed in the form `for item in stream`. Through the `RawIOBase` abstraction, this creates an iterator by looping over calls to `readinto` with a byte buffer of size 1.

The culprit is this line in particular: `localstack/localstack/aws/client.py`, line 37 at commit `68c507e`.
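To make the mechanics concrete, here is a simplified sketch of the pre-fix read path (illustrative only, not the code at the referenced lines):

```python
from io import RawIOBase


class _SlowResponseStream(RawIOBase):
    # Simplified, hypothetical sketch of the pre-fix behavior; the real
    # wrapper in localstack/aws/client.py differs in its details.

    def __init__(self, chunks):
        self._iterator = iter(chunks)
        self._chunk = b""

    def readinto(self, b) -> int:
        if not self._chunk:
            self._chunk = next(self._iterator, b"")
        if not self._chunk:
            return 0  # EOF
        n = min(len(b), len(self._chunk))
        b[:n] = self._chunk[:n]
        # The culprit pattern: re-slicing copies the entire remaining chunk,
        # even though only n bytes (often just one) were consumed.
        self._chunk = self._chunk[n:]
        return n

    def readable(self):
        return True


# Without its own __iter__, `for item in stream` falls back to IOBase's
# line-based iterator; on an unbuffered raw stream, readline() degenerates
# into readinto() calls with a one-byte buffer for every byte of the body.
```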
Since the underlying response is one big blob in memory, `chunk` will actually contain the entire S3 object. Consequently, we're always popping a single byte off the entire chunk to populate the response, and copying the remaining buffer. Each copy takes longer the bigger `chunk` is, which is the quadratic runtime behavior we were seeing.

After the fix, `readinto` might actually be dead code now, since the underlying webserver always consumes the response stream using the `for data in stream` pattern. However, `readinto` is still necessary for the `RawIOBase` implementation, so I added tests to cover it.
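Purely as an illustration of the contract those tests cover, a test along these lines shows how `read()` is served through `readinto()` on a `RawIOBase` stream (the stand-in class and test are hypothetical, not the tests added in this PR):

```python
from io import RawIOBase


class _ChunkStream(RawIOBase):
    # Hypothetical stand-in for the real wrapper, just for this sketch.
    def __init__(self, chunks):
        self._pending = b"".join(chunks)

    def readinto(self, b) -> int:
        n = min(len(b), len(self._pending))
        b[:n] = self._pending[:n]
        self._pending = self._pending[n:]
        return n

    def readable(self):
        return True


def test_read_is_served_through_readinto():
    stream = _ChunkStream([b"hello ", b"world"])

    buf = bytearray(4)
    # readinto() fills the caller's buffer and reports how many bytes it wrote
    assert stream.readinto(buf) == 4
    assert bytes(buf) == b"hell"

    # read() with no size argument is implemented by RawIOBase on top of readinto()
    assert stream.read() == b"o world"
```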
### Reproduce

Here's a simple benchmark. On my machine this takes
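As a stand-alone illustration of the cost (a sketch, not the original benchmark; the sizes and output format are arbitrary), one can time the re-slicing pattern directly:

```python
import time

# Hypothetical micro-benchmark: pop one byte at a time off a large in-memory
# blob and re-slice the remainder, as the old readinto()-driven path did.
for size in (50_000, 100_000, 200_000):
    chunk = b"x" * size
    start = time.perf_counter()
    while chunk:
        chunk = chunk[1:]  # copies the entire remaining buffer each iteration
    elapsed = time.perf_counter() - start
    print(f"{size:>7} bytes: {elapsed:.2f}s")  # expect roughly 4x per doubling
```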