FLAME-MoE 🔥: A Transparent End-to-End Research Platform for Mixture-of-Experts Language Models

FLAME-MoE is a transparent, end-to-end research platform for Mixture-of-Experts (MoE) language models, designed to support scalable training, evaluation, and experimentation with MoE architectures. The accompanying paper is available on arXiv.

🔗 Model Checkpoints

Explore our publicly released checkpoints on Hugging Face:


🚀 Getting Started

1. Clone the Repository

Ensure you clone the repository recursively to include all submodules:

git clone --recursive https://github.com/cmu-flame/MoE-Research
cd MoE-Research
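
If the repository was cloned without the --recursive flag, the submodules can still be fetched afterwards with a standard Git command:

git submodule update --init --recursive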

2. Set Up the Environment

Set up the Conda environment using the provided script:

sbatch scripts/miscellaneous/install.sh

Note: This assumes you're using a SLURM-managed cluster. Adapt accordingly if running locally.
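
On a machine without SLURM, a minimal sketch of the same step (assuming the script only creates a Conda environment and does not rely on SLURM-specific behavior) is to run it directly:

bash scripts/miscellaneous/install.sh    # run the setup script directly instead of via sbatch
conda env list                           # confirm the new environment was created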


📚 Data Preparation

3. Download and Tokenize the Dataset

Use the following SLURM jobs to download and tokenize the dataset:

sbatch scripts/dataset/download.sh
sbatch scripts/dataset/tokenize.sh
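
Both steps run as batch jobs, so it can help to confirm they have finished before moving on. The commands below are standard SLURM utilities rather than part of this repository:

squeue -u $USER                                  # jobs still queued or running
sacct -X --format=JobID,JobName,State,Elapsed    # state of recently completed jobs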

🧠 Training

4. Train FLAME-MoE Models

Launch training jobs for the desired model configurations:

bash scripts/release/flame-moe-1.7b.sh
bash scripts/release/flame-moe-721m.sh
bash scripts/release/flame-moe-419m.sh
bash scripts/release/flame-moe-290m.sh
bash scripts/release/flame-moe-115m.sh
bash scripts/release/flame-moe-98m.sh
bash scripts/release/flame-moe-38m.sh
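
For example, to train only the 290M configuration and keep track of the job it launches (assuming the release scripts submit SLURM jobs internally; the resulting job ID is what the evaluation step below expects):

bash scripts/release/flame-moe-290m.sh    # launch the FLAME-MoE-290M training run
squeue -u $USER                           # note the job ID of the training job for step 5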

📈 Evaluation

5. Evaluate the Model

To evaluate a trained model, set the appropriate job ID and iteration number before submitting the evaluation script:

export JOBID=...    # Replace with your training job ID
export ITER=...     # Replace with the iteration to evaluate (e.g., 11029)
sbatch scripts/evaluate.sh
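
As a concrete illustration, with a placeholder job ID and the example iteration from above:

export JOBID=123456    # hypothetical SLURM job ID from the training run
export ITER=11029      # iteration whose checkpoint to evaluate
sbatch scripts/evaluate.sh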

6. WandB Workspace

Training loss curves for both the scaling-law studies and the final releases are available in the following WandB workspace: https://wandb.ai/haok/flame-moe
