[contrib] Update apex.contrib.focal_loss for B200 #1888

Merged: crcrpar merged 1 commit into NVIDIA:master from crcrpar:blackwell/focal-loss on Mar 14, 2025
Conversation

crcrpar (Collaborator) commented Mar 12, 2025

  • Main change: use 8 CTAs per SM instead of 2, which improved performance on B200
  • Generalize the vectorization (accesses are still 128-bit wide)
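
For readers unfamiliar with the two knobs above, here is a minimal host-side sketch of how they are commonly expressed. It is not taken from the apex source: the helper names `elems_per_128bit` and `grid_size`, and the assumption that the CTAs-per-SM target caps the launch grid, are hypothetical and only illustrate the arithmetic.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical helpers for illustration; not the names used in apex.

// Number of elements of type T that fit in one 128-bit (16-byte) access.
// Deriving this from sizeof(T) keeps the access width fixed at 128 bits
// while letting the element count vary with the data type.
template <typename T>
constexpr int elems_per_128bit() {
    static_assert(16 % sizeof(T) == 0, "T must evenly divide 16 bytes");
    return static_cast<int>(16 / sizeof(T));
}

// Grid size when targeting a fixed number of CTAs per SM (assumed scheme).
// The grid is capped at num_sms * ctas_per_sm and never exceeds the number
// of CTAs actually needed to cover the input.
inline std::int64_t grid_size(std::int64_t num_elems, int threads_per_cta,
                              int elems_per_thread, int num_sms,
                              int ctas_per_sm) {
    const std::int64_t elems_per_cta =
        static_cast<std::int64_t>(threads_per_cta) * elems_per_thread;
    const std::int64_t needed =
        (num_elems + elems_per_cta - 1) / elems_per_cta;  // ceiling division
    const std::int64_t resident =
        static_cast<std::int64_t>(num_sms) * ctas_per_sm;
    return std::max<std::int64_t>(1, std::min(needed, resident));
}
```

Under this scheme, moving from 2 to 8 CTAs per SM raises the cap on the grid fourfold for the same device, and a 2-byte element type yields 8 elements per 128-bit access where a 4-byte type yields 4.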

Aidyn-A (Collaborator) left a comment

Works fine on Blackwell machines. Thanks!

crcrpar changed the title from "[contrib] Update focal_loss for B200" to "[contrib] Update apex.contrib.focal_loss for B200" on Mar 14, 2025
crcrpar merged commit 9c50239 into NVIDIA:master on Mar 14, 2025
3 participants