University of Cambridge
- https://holarissun.github.io/
- @HolarisSun
Stars
Projects related to Annual Computer Poker Competition
OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games.
A Collection of Competitive Text-Based Games for Language Model Evaluation and Reinforcement Learning
keithlee96 / pluribus-poker-AI
Forked from fedden/poker_ai
🤖 An Open Source Texas Hold'em AI
Poker-Hand-Evaluator: An efficient poker hand evaluation algorithm and its implementation, supporting 7-card poker and Omaha poker evaluation
BibTool is a tool for manipulating BibTeX databases. BibTeX provides a means to integrate citations into LaTeX documents. BibTool allows the manipulation of BibTeX files which goes beyond the possi…
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
verl: Volcano Engine Reinforcement Learning for LLMs
SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters (ICLR 2025)
Hackable and optimized Transformers building blocks, supporting a composable construction.
Active reward modeling with last layer Fisher Information (ICML'25)
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
This repo is meant to serve as a guide for Machine Learning/AI technical interviews.
Codebase for Paper Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs
Self-playing Adversarial Language Game Enhances LLM Reasoning, NeurIPS 2024
Reusable BatchBALD implementation
OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.
AlphaFold 3 inference pipeline.
RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RN…
Official implementation of the ICLR 2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and Alternatives