Comparing changes
base repository: PaddlePaddle/PaddleNLP
base: develop
head repository: PaddlePaddle/PaddleNLP
compare: dsv3-sft
- 14 commits
- 20 files changed
- 6 contributors
Commits on Aug 19, 2025
- 3794aa9
  * update expert parallel init logic
  * fix flash_mask && MoEFlexTokenLayer experts && add some config
  * offload optimizer
  Co-authored-by: blacksheep-Aristotle
Commits on Aug 20, 2025
- f2e43b9 fix use_rms_norm && add subbatch_token_num config (#10974)
  Co-authored-by: Your Name
Commits on Aug 22, 2025
- 7d5eb9a moelayer with subbatch to reduce memory (#10985)
  Co-authored-by: deepllz
Commits on Aug 28, 2025
- 4493f19 support sequence parallel in deepseek v3 model
  * support sequence parallel in deepseek v3
  * polish, remove 'print' command
- 546e1cb
Commits on Aug 29, 2025
- adc2f36
- d1a3d88
- 991a573
Commits on Sep 1, 2025
- 7adac11 compatible with lastest paddle develop branch && update SFT train config to get better performance
- a0fcbdc
Commits on Sep 3, 2025
- ad9e95b
- 7e317e6 fix aux_loss_alpha && lr value too big problem and add aux update callback and add mtp subatch_recompute (#11062)
  * fix ep grad
  * fix aux_loss_alpha && lr value too big problem and add aux update callback and add mtp subatch_recompute
  * fix logger error
Commits on Sep 5, 2025
- 6e67781
Commits on Sep 11, 2025
- 7adc457
This comparison is taking too long to generate.
Unfortunately it looks like we can’t render this comparison for you right now. It might be too big, or there might be something weird with your repository.
You can try running this command locally to see the comparison on your machine:
git diff develop...dsv3-sft
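
The three-dot form in that command compares `dsv3-sft` against its merge base with `develop`, not against the current tip of `develop`. The sketch below illustrates the difference in a throwaway repository. Only the two branch names come from this page; the file names, commit messages, and the `three_dot` / `two_dot` variables are made up for the demonstration.

```shell
#!/bin/sh
set -e

# Throwaway repository: branch names mirror the comparison above,
# but every file and commit here is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/develop   # start on a branch named develop
git config user.email "demo@example.com"
git config user.name "Demo"
git config commit.gpgsign false

echo base > base.txt
git add base.txt
git commit -q -m "base"

# Feature branch forks from develop and adds one file.
git checkout -q -b dsv3-sft
echo feature > feature.txt
git add feature.txt
git commit -q -m "feature work"

# develop moves on after the fork point.
git checkout -q develop
echo later > later.txt
git add later.txt
git commit -q -m "later work on develop"

# Three dots: diff from the merge base to dsv3-sft.
# Work that landed on develop after the fork is left out.
three_dot=$(git diff --name-only develop...dsv3-sft)

# Two dots: diff between the two branch tips.
# Work that landed on develop after the fork shows up as well.
two_dot=$(git diff --name-only develop..dsv3-sft)

echo "three-dot: $three_dot"
echo "two-dot:   $two_dot"
```

Against the real repository, adding `--stat` to the three-dot command gives a per-file summary, which may be easier to read than the full patch for a comparison of this size.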