Add multi_tensor_unscale_l2norm_cuda #1727

Merged
ptrblck merged 5 commits into NVIDIA:master from minitu:unscale_l2norm_pr on Sep 19, 2023

Contributor

@minitu minitu commented Sep 13, 2023

This PR adds multi_tensor_unscale_l2norm_cuda, which fuses gradient unscaling (with AMP) and the L2 norm computation of the gradients into a single operation.
To retain the original precision of the gradients (especially FP16), the unscaling is applied only inside the norm computation, not to the gradients themselves.
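
The behavior described above can be illustrated with a minimal pure-Python reference. This is only a sketch of the idea, not the CUDA kernel or its actual signature: the function name `unscale_l2norm_reference`, the nested-list stand-in for gradient tensors, and the `inv_scale = 1 / loss_scale` convention are all assumptions made for illustration.

```python
import math


def unscale_l2norm_reference(grads, inv_scale):
    """Return the global L2 norm of the unscaled gradients.

    grads: list of lists of floats, a hypothetical stand-in for a list of
        gradient tensors that still carry the AMP loss scale.
    inv_scale: assumed to be 1 / loss_scale.

    The gradients are only read. The unscaled value exists only as a
    temporary inside the accumulation, so the stored gradients keep
    their original (scaled) values and precision.
    """
    total = 0.0
    for grad in grads:
        for value in grad:
            unscaled = value * inv_scale  # unscale on the fly
            total += unscaled * unscaled  # accumulate the sum of squares
    return math.sqrt(total)


# Gradients scaled by a loss scale of 1024 (illustrative values).
scaled_grads = [[3.0 * 1024, 4.0 * 1024], [12.0 * 1024]]
norm = unscale_l2norm_reference(scaled_grads, 1.0 / 1024)
print(norm)
```

The resulting norm can then be used for decisions such as gradient clipping while the stored gradients remain scaled, which is the separation this PR describes.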

Collaborator

nWEIdia commented Sep 19, 2023

We manually verified this PR, and it works. Please go ahead and merge it.

@ptrblck ptrblck merged commit 741bdf5 into NVIDIA:master Sep 19, 2023