Tags: bitdousi/megablocks
Updt triton pin (databricks#89) * Update setup.py make torch pin more flexible * Update setup.py
Merge pull request databricks#47 from stanford-futuredata/fix-topology-kernel Fix bug in topology kernel for ffn_hidden_size>4096.
Merge pull request databricks#34 from stanford-futuredata/fsdp_refactor Refactoring class hierarchy for FSDP wrapping
Merge pull request databricks#31 from vchiley/no_bias Enable running MegaBlocks MoE without bias