Data mix fix #366
Signed-off-by: Aakanksha Duggal <[email protected]>
…g_format Signed-off-by: Aakanksha Duggal <[email protected]>
@Mergifyio backport release-v0.3
✅ Backports have been created
bbrowning
left a comment
Looks good to me - verified locally that this does add the pretraining samples into the phase10 dataset. I wrote a small unit test to verify this locally, so will create a follow-up PR to get that merged as well without blocking or adding to this one.
Thanks @bbrowning for your review. I do have some tests in flight that I can add as a follow-up PR.
Great - will hold off on turning my simple verification into an actual PR then and defer to you on that. Thanks for following up!
khaledsulayman
left a comment
I've also tested this on an EC2 instance with full Mixtral, and it looks good!
Data mix fix (backport #366)
Addresses #365