
If you have uv installed, install the dependencies with:

uv sync

To launch training:

cd src
accelerate launch --config_file accelerate_config.yaml train.py

Configs are in src/config.yaml and src/accelerate_config.yaml.

The sweep script is sweep.py. Run it with:

uv run sweep.py
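
For readers new to sweeps, the general idea is to enumerate combinations of settings and start one run per combination. The following is a minimal sketch of that idea only, not the contents of sweep.py: the function name `expand_grid` and the keys `learning_rate` and `batch_size` are hypothetical placeholders, and the actual script may work differently.

```python
from itertools import product


def expand_grid(grid):
    """Return one dict per combination of the values listed in `grid`."""
    keys = list(grid)
    return [
        dict(zip(keys, values))
        for values in product(*(grid[k] for k in keys))
    ]


# Hypothetical search space; these names and values are placeholders,
# not settings taken from src/config.yaml.
grid = {
    "learning_rate": [1e-6, 5e-6],
    "batch_size": [8, 16],
}

for run_config in expand_grid(grid):
    # A real sweep would launch one training run per config here.
    print(run_config)
```

Each dict produced this way could be written into a config file or passed to a training run; how that hand-off happens depends on the training script and is not shown here.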

TODO:

Add more evals; look into Evalchemy.
Add more of the metrics we're interested in.
Train on other datasets [MATH].
Speed up training: run evals with vLLM on all N GPUs instead of with transformers on the main process only.
Fix: we currently can't add more wandb metrics.
Add sweep code.

Baselines:

Qwen 2.5 Blog

Resources:

Anton's Replication Github Gist

Will's Replication Github Gist

Cite:

@misc{openrl2025,
      title={OpenRL: Replicating RL on LLMs ala Deepseek's R1 and OpenAI's O-Series}, 
      author={Andrew Wei Tung Siah},
      year={2025},
      url={https://github.com/andrewsiah/openrl}
}
