Conversation

Wennie396
Contributor

PR types

Others

PR changes

Others

Description

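Offloads part of the optimizer state. Intended for setups where the sharding degree is low and training cannot run unless the optimizer state is offloaded.
Takes effect only when `export HACK_OFFLOAD_OPTIMIZER=1` is set.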
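
For illustration, environment-variable gating of this kind can be sketched as below. This is a minimal sketch and not the code from this PR: the helper name `offload_enabled` is hypothetical, and it assumes the flag is compared as the string "1", which may differ from how the actual change parses it.

```python
import os


def offload_enabled(env=None):
    """Return True only when HACK_OFFLOAD_OPTIMIZER is set to "1".

    Hypothetical helper: the real PR may read or parse the variable differently.
    """
    # Fall back to the process environment when no mapping is passed in.
    env = os.environ if env is None else env
    return env.get("HACK_OFFLOAD_OPTIMIZER", "0") == "1"
```

Passing the environment as a mapping keeps the check easy to test without mutating `os.environ`.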

paddle-bot bot commented Sep 11, 2025

Thanks for your contribution!

@@ -614,6 +614,16 @@ class TrainingArguments:
)
},
)
sharding_offload_opt_buffersize_GB: int = field(
Contributor

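This parameter is specified in GB, but I also see some sizes specified in MB.

Does the framework currently support GB as a unit? Please consider whether MB-based configuration is needed. Later on, this parameter could also be changed to a float to allow finer-grained configuration.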
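
As a rough sketch of the float-based suggestion above, a fractional GB value could be converted to a byte count as follows. The function name `buffer_size_bytes` is hypothetical and not part of this PR, and the sketch assumes 1 GB means 1024**3 bytes, which the PR does not state.

```python
def buffer_size_bytes(size_gb: float) -> int:
    """Convert a buffer size in GB (possibly fractional) to whole bytes.

    Assumption: 1 GB = 1024**3 bytes. Hypothetical helper for illustration only.
    """
    if size_gb < 0:
        raise ValueError("buffer size must be non-negative")
    # Truncate to an integer byte count after scaling.
    return int(size_gb * 1024**3)
```

With a float parameter, a value such as `0.5` expresses 512 MB without needing a separate MB-based option.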

@ZHUI ZHUI merged commit e28c816 into PaddlePaddle:develop Sep 15, 2025
9 of 10 checks passed
3 participants