Several fixes to make Completion.acreate(stream=True) work #172
Conversation
The API will send chunks like

```
b'data: {"id": "cmpl-6W18L0k1kFoHUoSsJOwcPq7DKBaGX", "object": "text_completion", "created": 1673088873, "choices": [{"text": "_", "index": 0, "logprobs": null, "finish_reason": null}], "model": "ada"}\n\n'
```

The default iterator will break on each `\n` character, whereas `iter_chunks` will just output parts as they arrive.
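For illustration, a minimal sketch (not the library's actual parsing code) of the two iteration styles over an aiohttp response body; `resp` stands in for the `ClientResponse` of a streaming request:

```
import aiohttp

async def read_by_line(resp: aiohttp.ClientResponse) -> None:
    # Default iteration: the StreamReader yields *lines*, splitting on b"\n",
    # so the trailing b"\n\n" of each payload produces "empty" parts.
    async for line in resp.content:
        print(line)  # every other item is just b"\n"

async def read_by_chunk(resp: aiohttp.ClientResponse) -> None:
    # iter_chunks() yields (bytes, end_of_http_chunk) pairs as data arrives,
    # so each b'data: {...}\n\n' payload comes through in one piece.
    async for data, _end_of_http_chunk in resp.content.iter_chunks():
        print(data)
```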
Force-pushed from 5cc1809 to 2e2e20e (compare)
Manually consume the `aiohttp_session` async context manager to ensure that the session is only closed once the response stream is finished. Previously we'd exit the `with` statement before the response stream is consumed by the caller; therefore, unless we're using a global ClientSession, the session (and thus the request) is closed before it should be.
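For context, a simplified sketch of the pre-fix shape (names mirror the diff below; this is not verbatim source):

```
# BUG (pre-fix): returning from inside `async with` exits the context
# manager, closing the session -- and the in-flight request -- before the
# caller has consumed the response stream.
async def arequest(self, method, url, params, stream):
    async with aiohttp_session() as session:
        result = await self.arequest_raw(method.lower(), url, session, params=params)
        resp, got_stream = await self._interpret_async_response(result, stream)
        return resp, got_stream, self.api_key  # session closes on return
```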
Changed the title from "Completion.acreate(stream=True) work with both local and global aiohttp session" to "Completion.acreate(stream=True) work".
Thank you so much for fixing this! I left a couple of comments but otherwise this is looking great.
openai/api_requestor.py (outdated)
```
ctx = aiohttp_session()
session = await ctx.__aenter__()
result = await self.arequest_raw(
    method.lower(),
    url,
    session,
    params=params,
    supplied_headers=headers,
    files=files,
    request_id=request_id,
    request_timeout=request_timeout,
)
resp, got_stream = await self._interpret_async_response(result, stream)
if got_stream:

    async def wrap_resp():
        async for r in resp:
            yield r
        await ctx.__aexit__(None, None, None)

    return wrap_resp(), got_stream, self.api_key
else:
    await ctx.__aexit__(None, None, None)
```
At first I thought it'd be easier to just fetch/create a ClientSession here rather than getting the async generator and calling `__aenter__` and `__aexit__` manually, but since we have to deal with manually closing one while being careful not to close the other, I think it's probably fine.
Yeah, I think the context manager is still worth it for that encapsulation.
openai/api_requestor.py (outdated)
```
async def wrap_resp():
    async for r in resp:
        yield r
    await ctx.__aexit__(None, None, None)
```
I guess it's possible for this async generator to never complete (for example if the caller raises an exception before completing the iteration), in which case we'll never close this session, which I think raises an exception on the event loop? Maybe we should create a session on the `APIRequestor` instance instead 🤔
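To make the concern concrete, an illustration (assuming the `wrap_resp` from the diff above): cleanup placed after the loop only runs when iteration finishes normally.

```
async def wrap_resp():
    async for r in resp:
        yield r
    # Never reached if the consumer stops early -- the session stays open
    # until the abandoned generator is garbage-collected.
    await ctx.__aexit__(None, None, None)

async def consumer():
    async for r in wrap_resp():
        raise RuntimeError("caller bails out mid-iteration")
```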
Good catch! I think it should be good now in the latest commit.
```
@@ -63,3 +64,26 @@ async def test_timeout_does_not_error():
        model="ada",
        request_timeout=10,
    )


async def test_completions_stream_finishes_global_session():
```
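A plausible shape for the test body, which is elided above (`openai.aiosession` is the library's global-session ContextVar; the prompt, model, and assertion here are guesses, not the PR's actual code):

```
import aiohttp
import openai

async def test_completions_stream_finishes_global_session():
    async with aiohttp.ClientSession() as session:
        openai.aiosession.set(session)  # caller-owned global session
        parts = []
        async for part in await openai.Completion.acreate(
            prompt="hello", model="ada", request_timeout=10, stream=True
        ):
            parts.append(part)
        assert len(parts) > 1  # the stream is consumed past the first chunk
```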
Thanks for adding this test.
Ensure we close the session even if the caller raises an exception while consuming the stream
```
        async for r in resp:
            yield r
    finally:
        await ctx.__aexit__(None, None, None)
```
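Reconstructed in full around the fragment above (the `try:` line is implied by the diff), the fixed wrapper reads:

```
async def wrap_resp():
    try:
        async for r in resp:
            yield r
    finally:
        # Runs whether iteration finishes, the consumer raises, or the
        # generator is closed early, so the session is always released.
        await ctx.__aexit__(None, None, None)
```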
This might still not be called if the caller never actually iterates through the response and just drops it, right?
It's probably fine for now to fix this bug, but I imagine we'll want to scope the session to the requestor itself in the future so that we can always ensure that it's closed.
I think that's an issue inherent to exposing an async iterator as an API here? If the caller doesn't consume it, all sorts of bad things may happen... The only solution I can think of is to add some cleanup with a timeout, but that sounds a bit invasive. Another option would be to make this a bit fake and fully consume the iterator ourselves, buffering it all in memory, but that partly defeats the purpose of asking for a stream.
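One caller-side mitigation, sketched here as an assumption rather than anything this PR adds: explicitly `aclose()` the stream when abandoning it, which throws `GeneratorExit` into the generator and runs its `finally:` block deterministically.

```
import openai

async def consume_first_chunk() -> None:
    stream = await openai.Completion.acreate(model="ada", prompt="hi", stream=True)
    try:
        async for part in stream:
            break  # abandon the stream early
    finally:
        # Assumes the returned stream is a plain async generator.
        await stream.aclose()
```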
Yep that's fair.
Alright let's merge this, thank you so much for fixing this bug!
* Added a failing test case for async completion stream
* Consume async generator with `async for`
* Consume the stream in chunks as sent by the API, to avoid "empty" parts. The API will send chunks like `b'data: {"id": "cmpl-6W18L0k1kFoHUoSsJOwcPq7DKBaGX", "object": "text_completion", "created": 1673088873, "choices": [{"text": "_", "index": 0, "logprobs": null, "finish_reason": null}], "model": "ada"}\n\n'`; the default iterator will break on each `\n` character, whereas `iter_chunks` will just output parts as they arrive
* Add another test using global aiosession
* Manually consume the `aiohttp_session` async context manager to ensure that the session is only closed once the response stream is finished. Previously we'd exit the `with` statement before the response stream is consumed by the caller; therefore, unless we're using a global ClientSession, the session (and thus the request) is closed before it should be.
* Ensure we close the session even if the caller raises an exception while consuming the stream
#171
Note that, as per the issue above, even after the two fixes below, this test still fails: the 2nd chunk of the stream never arrives.