Open
Description
- For the GPT2 implementation, are the evaluation samples taken from the training dataset?
- Regarding the paper's quantitative comparison (Table 2), is the benchmark there also the training dataset for the models?
- How do the models perform on out-of-distribution samples (i.e., prompts that are not from the dataset)?