Conversation

@aakankshaduggal
Member

@aakankshaduggal aakankshaduggal commented Nov 11, 2024

Addresses #365

@mergify mergify bot added the ci-failure label Nov 11, 2024
@mergify mergify bot added ci-failure and removed ci-failure labels Nov 11, 2024
Signed-off-by: Aakanksha Duggal <[email protected]>
@mergify mergify bot removed the ci-failure label Nov 12, 2024
@aakankshaduggal aakankshaduggal marked this pull request as ready for review November 12, 2024 01:25
@nathan-weinberg
Member

@Mergifyio backport release-v0.3

@mergify
Contributor

mergify bot commented Nov 12, 2024

backport release-v0.3

✅ Backports have been created

Contributor

@bbrowning bbrowning left a comment

Looks good to me - I verified locally that this does add the pretraining samples into the phase10 dataset. I wrote a small unit test to verify this, so I will create a follow-up PR to get that merged as well without blocking or adding to this one.

@mergify mergify bot added the one-approval label Nov 12, 2024
@aakankshaduggal
Member Author

Thanks @bbrowning for your review. I do have some tests in flight that I can add as a follow-up PR.

@bbrowning
Contributor

Great - will hold off on turning my simple verification into an actual PR then and defer to you on that. Thanks for following up!

Member

@khaledsulayman khaledsulayman left a comment

I've also tested this on an EC2 instance with full Mixtral, and it looks good!

@mergify mergify bot merged commit b6f07a8 into instructlab:main Nov 12, 2024
22 checks passed
@mergify mergify bot removed the one-approval label Nov 12, 2024
@mergify mergify bot mentioned this pull request Nov 12, 2024
bbrowning added a commit that referenced this pull request Nov 13, 2024

Development

Successfully merging this pull request may close these issues.

Data Mixing Phase 10 - knowledge pre-training dataset not getting mixed in

4 participants