[Bug] sageattn3_blackwell: CUDA error: misaligned address on RTX 5060 Ti (Blackwell) #357

@gzsiang

Bug Description

When using the sageattn3_blackwell mode on an RTX 5060 Ti (Blackwell architecture), inference fails with "CUDA error: misaligned address".

Environment

  • OS: Windows 11
  • GPU: NVIDIA RTX 5060 Ti 16GB (Blackwell architecture, sm_120)
  • Python: 3.14.0
  • PyTorch: 2.12.0.dev20260318+cu130
  • CUDA: 13.0
  • SageAttention3: Latest code from GitHub (fresh clone and compile)
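For anyone trying to reproduce this, the version details above can be gathered programmatically. The following is a minimal sketch, not part of SageAttention3; the helper names `format_report` and `collect_fields` are made up for illustration, and it only relies on standard PyTorch attributes (`torch.__version__`, `torch.version.cuda`, `torch.cuda.get_device_capability`).

```python
import platform


def format_report(fields):
    """Render (label, value) pairs as one bullet line each."""
    return "\n".join(f"- {label}: {value}" for label, value in fields)


def collect_fields():
    """Gather OS, Python, PyTorch, CUDA, and GPU details; tolerate a missing torch."""
    fields = [
        ("OS", f"{platform.system()} {platform.release()}"),
        ("Python", platform.python_version()),
    ]
    try:
        import torch
    except ImportError:
        fields.append(("PyTorch", "not installed"))
        return fields
    fields.append(("PyTorch", torch.__version__))
    # CUDA version PyTorch was built against (None for CPU-only builds).
    fields.append(("CUDA (PyTorch build)", torch.version.cuda))
    if torch.cuda.is_available():
        major, minor = torch.cuda.get_device_capability(0)
        name = torch.cuda.get_device_name(0)
        fields.append(("GPU", f"{name} (sm_{major}{minor})"))
    return fields


if __name__ == "__main__":
    print(format_report(collect_fields()))
```

On the setup described here, the GPU line would be expected to report compute capability sm_120.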

Steps to Reproduce

  1. Clone the SageAttention3 repository from GitHub
  2. Compile with Visual Studio 2022 (MSVC 19.44), adding the -allow-unsupported-compiler flag to nvcc_flags
  3. Install with pip install -e .
  4. Use the sageattn3_blackwell mode in ComfyUI with the Patch Sage Attention node
  5. Run inference

Expected Behavior

Attention computation should work without CUDA errors.

Actual Behavior

RuntimeError: CUDA error: misaligned address
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
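Since the error is reported asynchronously, the Python stack trace may not point at the kernel that actually faulted. One common way to narrow this down is to rerun with `CUDA_LAUNCH_BLOCKING=1`, which makes kernel launches synchronous. The sketch below assumes the workload is started from a Python entry script; the helper names `blocking_env` and `run_blocking` and the `main.py` path in the comment are hypothetical.

```python
import os
import subprocess
import sys


def blocking_env(base=None):
    """Return a copy of the environment with synchronous CUDA launches enabled."""
    env = dict(os.environ if base is None else base)
    # With blocking launches, the error surfaces at the call that
    # launched the failing kernel rather than at a later API call.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def run_blocking(script, *args):
    """Run a Python script in a child process with blocking launches."""
    cmd = [sys.executable, script, *args]
    return subprocess.run(cmd, env=blocking_env())


# Hypothetical usage, assuming main.py is the ComfyUI entry script:
#   run_blocking("main.py")
```

The variable has to be set before CUDA is initialized, which is why the sketch sets it for a child process instead of changing it after `torch` is already loaded.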

Additional Context

Notes

I compiled SageAttention3 successfully after adding the -allow-unsupported-compiler flag to work around the Visual Studio version check, but the misaligned address error still occurs at runtime. This suggests the problem lies in the Blackwell kernel code itself rather than in compilation.
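A misaligned address error generally means a kernel issued a load or store at an address that is not a multiple of the access width. To help rule out the input layout, the tensors passed to the attention call could be inspected as in the sketch below. The 16-byte alignment is an assumption for illustration, not a documented requirement of SageAttention3, and the helper names are hypothetical; `data_ptr()`, `stride()`, and `is_contiguous()` are standard `torch.Tensor` methods.

```python
def misalignment(address, alignment=16):
    """Return how many bytes `address` lies past the previous aligned boundary."""
    return address % alignment


def describe_tensor(name, t, alignment=16):
    """Summarize the memory layout of a torch.Tensor-like object."""
    offset = misalignment(t.data_ptr(), alignment)
    return (
        f"{name}: shape={tuple(t.shape)} stride={tuple(t.stride())} "
        f"contiguous={t.is_contiguous()} ptr_mod_{alignment}={offset}"
    )


# Hypothetical usage with the q, k, v tensors given to the attention call:
#   for name, t in (("q", q), ("k", k), ("v", v)):
#       print(describe_tensor(name, t))
```

A nonzero `ptr_mod_16` or a non-contiguous layout (for example from slicing or transposing) would be worth including in the report, although aligned inputs would not rule out a fault inside the kernel.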
