Conversation

@suragnair
Collaborator

@suragnair suragnair commented Oct 25, 2024

  • Added support for Flash Attention layers.
  • Added a `flash_attn` flag in `BorzoiModel` that switches to flash attention (see the usage sketch below).
  • Updated the Dockerfile to use the PyTorch dev image, which provides the CUDA toolkit required by flash-attn; the Dockerfile now installs flash-attn.
  • Installation is a bit more involved, so I have updated the README with instructions. However, a plain `pip install gReLU` still works if flash-attn is not needed, since flash-attn is only imported when called.

Solves #64
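
A minimal usage sketch of the new flag (the import path and constructor defaults here are assumptions; only the `flash_attn` argument on `BorzoiModel` comes from this PR):

```python
from grelu.model.models import BorzoiModel  # import path is an assumption

# Default: standard attention; works with a plain `pip install gReLU`
model = BorzoiModel()

# With flash-attn installed (see the updated README/Dockerfile),
# flip the flag to build the model with Flash Attention layers instead.
model_flash = BorzoiModel(flash_attn=True)
```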

Surag and others added 6 commits October 21, 2024 15:39
Start from the PyTorch dev image instead of the Lightning image. This is
important since flash-attn requires cudatoolkit-dev, which would otherwise
need conda; it is easier to start from a dev Docker container. Added the
flash-attn install.
The current install should work fine for those who don't need flash-attn,
since the flash-attn imports live inside the FlashAttention function
itself.
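
A sketch of that lazy-import pattern: the class name `FlashAttention` comes from the commit message above, while the method body and tensor shapes are illustrative rather than the actual gReLU implementation:

```python
import torch.nn as nn


class FlashAttention(nn.Module):
    """Attention layer that defers the flash-attn import until it is called."""

    def forward(self, q, k, v):
        # Importing inside the function (not at module level) means a plain
        # `pip install gReLU` works without flash-attn; the dependency is
        # only required when this layer is actually used.
        from flash_attn import flash_attn_func

        # q, k, v: (batch, seq_len, n_heads, head_dim) half-precision tensors
        return flash_attn_func(q, k, v)
```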
@suragnair suragnair requested a review from avantikalal October 25, 2024 22:04
Surag and others added 2 commits October 25, 2024 17:24
@avantikalal avantikalal merged commit 31a2133 into main Oct 26, 2024
2 checks passed
@suragnair suragnair deleted the surag branch October 26, 2024 01:30