This repository was archived by the owner on Jun 3, 2025. It is now read-only.

[Text Generation][KVCacheStorage] TextGenerationPipeline refactor#1254

Merged
bfineran merged 12 commits into
mainfrom
feature/damian/chat_pipeline
Sep 21, 2023

Conversation

@dbogunowicz dbogunowicz commented Sep 19, 2023

Contributor

As agreed with the team, the old design for KVCacheSessionStorage had become ugly after the series of recent refactors.
The goal of this PR is to decouple the DecoderKVCache from the NLDecoderEngine. This will allow us to implement the upcoming ChatPipeline(TextGenerationPipeline) much more cleanly.

This is roughly the design envisioned:

[image: design diagram]
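To illustrate the decoupling described above, here is a minimal sketch in which the engine holds no cache state and the pipeline owns one cache per session. All names (ToyKVCache, ToyEngine, ToyPipeline, session_id) are hypothetical and do not reflect the actual DecoderKVCache or NLDecoderEngine interfaces; the "model" is a toy arithmetic stand-in.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyKVCache:
    """Hypothetical per-session cache; NOT the DeepSparse DecoderKVCache API."""

    entries: List[int] = field(default_factory=list)

    def append(self, token: int) -> None:
        self.entries.append(token)


class ToyEngine:
    """Hypothetical stateless engine: the cache is passed in on every call."""

    def step(self, token: int, cache: ToyKVCache) -> int:
        # Update the caller-owned cache, then compute a toy "next token".
        cache.append(token)
        return sum(cache.entries) % 100


class ToyPipeline:
    """Hypothetical pipeline that owns the caches, keyed by session id."""

    def __init__(self) -> None:
        self.engine = ToyEngine()
        self.sessions: Dict[str, ToyKVCache] = {}

    def generate(self, session_id: str, tokens: List[int]) -> List[int]:
        # Reuse the session's cache if it exists, otherwise start a fresh one.
        cache = self.sessions.setdefault(session_id, ToyKVCache())
        return [self.engine.step(token, cache) for token in tokens]
```

Because the engine never stores a cache itself, a chat-style pipeline could keep several sessions alive at once and hand the right cache to the engine on each call.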
Testing:

  • Successfully ran the LLM testing suite in test_text_generation.py.
  • Note: I did not run eval_downstream, because that pathway is currently broken. As far as I know, this is in agreement with @alexm-nm and @dsikka, who are working on landing the eval_downstream refactor.

streamer.end()

if self._debug:
    self._debug = dict(kv_cache=session)

Contributor Author

purely for testing purposes

Contributor

Even for debugging, we don't want to update state at runtime. We should attach this to the returned output if possible (it does not need to be part of the schema).
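One way to read this suggestion, as a rough sketch: keep the debug flag as read-only configuration and hang the debug payload on the object that is returned. ToyOutput, ToyGenerator, and run are hypothetical names, not the pipeline's real classes, and the optional debug field is declared on the dataclass here only for simplicity.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ToyOutput:
    """Hypothetical output object; not the real pipeline output schema."""

    text: str
    debug: Optional[Dict[str, Any]] = None


class ToyGenerator:
    """Hypothetical generator whose debug flag is never reassigned at runtime."""

    def __init__(self, debug: bool = False) -> None:
        self._debug = debug  # configuration only

    def run(self, prompt: str) -> ToyOutput:
        session: List[str] = prompt.split()  # stand-in for the KV cache session
        output = ToyOutput(text=" ".join(session))
        if self._debug:
            # Attach debug data to the returned object instead of mutating self.
            output.debug = {"kv_cache": session}
        return output
```

With this shape, repeated calls cannot leak one run's debug data into the next, since nothing on the generator changes after construction.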

@dbogunowicz dbogunowicz changed the title Feature/damian/chat pipeline [Text Generation][KVCacheStorage] TextGenerationPipeline refactor Sep 20, 2023
@dbogunowicz dbogunowicz marked this pull request as ready for review September 20, 2023 12:36

@rahul-tuli rahul-tuli left a comment

Member

The refactor looks much nicer than the original code! GG!

Comment thread src/deepsparse/transformers/pipelines/text_generation.py
Comment thread src/deepsparse/transformers/utils/helpers.py
Comment thread src/deepsparse/transformers/engines/nl_decoder_engine.py Outdated
Comment thread src/deepsparse/transformers/engines/nl_decoder_engine.py Outdated
Comment thread src/deepsparse/transformers/pipelines/text_generation.py Outdated
Comment thread src/deepsparse/transformers/pipelines/text_generation.py Outdated
Comment thread src/deepsparse/transformers/pipelines/text_generation.py Outdated
Comment thread src/deepsparse/transformers/pipelines/text_generation.py
Comment thread src/deepsparse/transformers/utils/helpers.py
Comment thread src/deepsparse/transformers/engines/nl_decoder_engine.py
Comment thread src/deepsparse/transformers/utils/decoder_kv_cache.py
Comment thread src/deepsparse/transformers/engines/nl_decoder_engine.py Outdated
Comment thread src/deepsparse/transformers/engines/nl_decoder_engine.py

@rahul-tuli rahul-tuli left a comment

Member

LGTM with a few nits!

Comment thread src/deepsparse/transformers/engines/nl_decoder_engine.py
Comment thread src/deepsparse/transformers/engines/nl_decoder_engine.py
Comment thread src/deepsparse/transformers/engines/nl_decoder_engine.py
Comment thread src/deepsparse/transformers/utils/decoder_kv_cache.py
@bfineran

Contributor

failures look unrelated - merging

@bfineran bfineran merged commit fdb5d44 into main Sep 21, 2023
@bfineran bfineran deleted the feature/damian/chat_pipeline branch September 21, 2023 17:55