This repository was archived by the owner on Jun 3, 2025. It is now read-only.

[Fix][Text Generation Pipeline] Fix the erroneous sampling logic #1406

Merged
dbogunowicz merged 1 commit into main from fix/damian/sampling on Nov 15, 2023

Conversation

@dbogunowicz dbogunowicz commented Nov 15, 2023

Contributor

Fix Description

Before: top_k and top_p filtering were applied regardless of whether sampling was True or False.
Now: if sampling=False, we go directly to the argmax function and skip all sampling logic.
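
To make the intended control flow concrete, here is a minimal sketch of the gating described above. It is not the code from token_generator.py: the function name pick_token, the rng parameter, and the use of plain Python lists in place of numpy.ndarray are all assumptions made to keep the example dependency-free.

```python
import math
import random
from typing import List, Optional


def pick_token(
    logits: List[float],
    sampling: bool,
    rng: Optional[random.Random] = None,
) -> int:
    """Hypothetical helper: return the index of the next token.

    Plain lists stand in for numpy.ndarray; this is an illustration,
    not the DeepSparse implementation.
    """
    if not sampling:
        # Deterministic path: go straight to argmax, no filtering at all.
        return max(range(len(logits)), key=logits.__getitem__)

    # Sampling path: turn logits into softmax weights and draw one index.
    # (Any top_k / top_p / temperature filtering would happen before this.)
    peak = max(logits)
    weights = [math.exp(x - peak) for x in logits]
    rng = rng or random.Random()
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]
```

With sampling=False the result depends only on the logits, so repeated calls on the same input return the same token.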

@horheynm Could you please validate the rest of the logic in def generate(self, logits: numpy.ndarray)? In the most complex scenario, we apply top_k, top_p, and sampling_temperature sequentially to the logits. Let's make sure that the order in which these functions are applied matches the one defined in HF (I assume this is the original implementation that we want to mimic).
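
As a reference point for that check, the sketch below chains the three steps in the order temperature, then top_k, then top_p. That order is my understanding of the HF default and should be treated as an assumption to verify against the HF source. All function names here (apply_temperature, apply_top_k, apply_top_p, warp_logits) are hypothetical, and plain Python lists are used instead of numpy.ndarray to keep the example self-contained.

```python
import math
from typing import List

NEG_INF = float("-inf")


def apply_temperature(logits: List[float], temperature: float) -> List[float]:
    # Temperatures below 1 sharpen the distribution, above 1 flatten it.
    return [x / temperature for x in logits]


def apply_top_k(logits: List[float], top_k: int) -> List[float]:
    # top_k <= 0 is treated as "disabled" in this sketch.
    if top_k <= 0 or top_k >= len(logits):
        return list(logits)
    threshold = sorted(logits, reverse=True)[top_k - 1]
    # Ties at the threshold are kept, so more than top_k entries may survive.
    return [x if x >= threshold else NEG_INF for x in logits]


def softmax(logits: List[float]) -> List[float]:
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def apply_top_p(logits: List[float], top_p: float) -> List[float]:
    if top_p >= 1.0:
        return list(logits)
    probs = softmax(logits)
    order = sorted(range(len(logits)), key=lambda i: probs[i], reverse=True)
    keep = set()
    cumulative = 0.0
    # Keep the smallest set of tokens whose total probability reaches top_p.
    for i in order:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return [x if i in keep else NEG_INF for i, x in enumerate(logits)]


def warp_logits(
    logits: List[float],
    temperature: float = 1.0,
    top_k: int = 0,
    top_p: float = 1.0,
) -> List[float]:
    # Assumed order: temperature -> top_k -> top_p.
    out = apply_temperature(logits, temperature)
    out = apply_top_k(out, top_k)
    out = apply_top_p(out, top_p)
    return out
```

The order matters because top_p operates on probabilities computed after temperature scaling and after top_k has masked entries, so swapping the steps can change which tokens survive.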

Comment thread src/deepsparse/transformers/utils/token_generator.py
@rahul-tuli

Member

Could you add the output before and after the fix?

4 participants