Hao Sun holarissun

🎯

Focusing

PhD in Reinforcement Learning, LLM Alignment, RLHF

116 followers · 37 following

University of Cambridge
https://holarissun.github.io/
@HolarisSun

Stars

JacobPfau / fillerTokens

Python 75 7 Updated Apr 27, 2024

ethansbrown / acpc

Projects related to Annual Computer Poker Competition

C 15 11 Updated Sep 19, 2016

google-deepmind / open_spiel

OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games.

C++ 4,932 1,063 Updated Dec 21, 2025

google-deepmind / game_arena

Python 81 21 Updated Aug 4, 2025

LeonGuertler / UnstableBaselines

Python 116 13 Updated Dec 7, 2025

LeonGuertler / TextArena

A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning

Python 331 77 Updated Oct 29, 2025

keithlee96 / pluribus-poker-AI

Forked from fedden/poker_ai

🤖 An Open Source Texas Hold'em AI

Python 338 74 Updated Oct 22, 2023

HenryRLee / PokerHandEvaluator

Poker-Hand-Evaluator: An efficient poker hand evaluation algorithm and its implementation, supporting 7-card poker and Omaha poker evaluation

C 478 104 Updated Nov 25, 2025

ge-ne / bibtool

BibTool is a tool for manipulating BibTeX data bases. BibTeX provides a mean to integrate citations into LaTeX documents. BibTool allows the manipulation of BibTeX files which goes beyond the possi…

C 232 32 Updated Sep 15, 2025

inclusionAI / AReaL

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

Python 3,290 261 Updated Dec 27, 2025

hiyouga / EasyR1

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 4,334 334 Updated Dec 24, 2025

proroklab / popgym

Partially Observable Process Gym

Python 209 17 Updated Jun 12, 2025

ruixin31 / Spurious_Rewards

Python 345 20 Updated Jul 29, 2025

zhangxy-2019 / critique-GRPO

Python 52 3 Updated Oct 2, 2025

volcengine / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 17,839 2,917 Updated Dec 27, 2025

jingyangcarl / openreview

Python 9 2 Updated Dec 4, 2025

span-man / ebooks

172 60 Updated Aug 26, 2020

tengxiao1 / SimPER

SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters (ICLR 2025)

Python 15 Updated Aug 22, 2025

facebookresearch / xformers

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 10,228 751 Updated Dec 25, 2025

YunyiShen / ARM-FI

Active reward modeling with last layer Fisher Information (ICML'25)

Python 7 Updated Jul 9, 2025

QwenLM / Qwen3

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Python 25,906 1,818 Updated Oct 13, 2025

alirezadir / Machine-Learning-Interviews

This repo is meant to serve as a guide for Machine Learning/AI technical interviews.

Jupyter Notebook 7,374 1,335 Updated Nov 28, 2025

holarissun / embedding-based-llm-alignment

Codebase for Paper Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs

Python 21 2 Updated Apr 24, 2025

Linear95 / SPAG

Self-playing Adversarial Language Game Enhances LLM Reasoning, NeurIPS 2024

Python 142 24 Updated Feb 24, 2025

BlackHC / batchbald_redux

Reusable BatchBALD implementation

Jupyter Notebook 79 15 Updated Feb 28, 2024

deepseek-ai / DeepSeek-R1

91,611 11,771 Updated Jun 27, 2025

opendilab / DI-engine

OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.

Python 3,562 421 Updated Dec 7, 2025

google-deepmind / alphafold3

AlphaFold 3 inference pipeline.

Python 7,372 1,050 Updated Dec 25, 2025

BlinkDL / RWKV-LM

RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RN…

Python 14,257 982 Updated Dec 19, 2025

holarissun / RewardModelingBeyondBradleyTerry

official implementation of ICLR'2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and Alternatives

Python 70 4 Updated Apr 2, 2025
