This repository was archived by the owner on Jun 3, 2025. It is now read-only.

[Fix] QA Pipeline fails on SQuAD with seq_len=128 #889

Merged: dbogunowicz merged 3 commits into main from fix/damian/doc_stride on Jan 27, 2023

@dbogunowicz (Contributor)

Fixes the issue tracked at https://app.asana.com/0/1201735099598270/1203822912826533/f

This PR addresses an issue where the default value for the doc_stride argument was set too high.
According to the documentation for the transformers library, doc_stride must be smaller than max_seq_length minus the length of the truncated question and the number of special tokens (sequence_added_tokens):

doc_stride < max_seq_length - len(truncated_question) - sequence_added_tokens

Specifically, for a max_seq_length of 128 and assuming no special tokens, doc_stride must be less than 128 minus the length of the truncated question, measured in tokens. This PR lowers the default doc_stride to satisfy this constraint.
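
The inequality above can be checked before running the pipeline. The following is a minimal sketch, not code from this PR or from transformers: the helper names `doc_stride_bound`, `is_valid_doc_stride`, and `max_valid_doc_stride` are hypothetical, lengths are assumed to be counted in tokens, and the example values (a 20-token question, 3 special tokens) are illustrative only.

```python
def doc_stride_bound(max_seq_length: int,
                     truncated_question_len: int,
                     sequence_added_tokens: int) -> int:
    """Exclusive upper bound on doc_stride:
    max_seq_length - len(truncated_question) - sequence_added_tokens.
    All lengths are assumed to be token counts."""
    return max_seq_length - truncated_question_len - sequence_added_tokens


def is_valid_doc_stride(doc_stride: int,
                        max_seq_length: int,
                        truncated_question_len: int,
                        sequence_added_tokens: int) -> bool:
    """True if doc_stride is positive and strictly below the bound."""
    bound = doc_stride_bound(max_seq_length,
                             truncated_question_len,
                             sequence_added_tokens)
    return 0 < doc_stride < bound


def max_valid_doc_stride(max_seq_length: int,
                         truncated_question_len: int,
                         sequence_added_tokens: int) -> int:
    """Largest integer doc_stride that satisfies the inequality."""
    bound = doc_stride_bound(max_seq_length,
                             truncated_question_len,
                             sequence_added_tokens)
    if bound <= 1:
        # No positive stride fits: the question and special tokens
        # leave no room for context in the sequence.
        raise ValueError("no valid doc_stride for these lengths")
    return bound - 1


# Illustrative values only (assumptions, not taken from the PR).
print(is_valid_doc_stride(128, 128, 20, 3))  # stride equal to max_seq_length
print(is_valid_doc_stride(64, 128, 20, 3))   # a smaller stride
print(max_valid_doc_stride(128, 20, 3))
```

With these assumed values the bound is 128 - 20 - 3 = 105, so a stride of 128 is rejected while 64 is accepted. With zero special tokens the bound reduces to 128 minus the question length, matching the case described above.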

@dbogunowicz dbogunowicz merged commit 6ba4f12 into main Jan 27, 2023
@dbogunowicz dbogunowicz deleted the fix/damian/doc_stride branch January 27, 2023 15:59

4 participants