University of Cambridge
- https://holarissun.github.io/
- @HolarisSun
holarissun.github.io Public
Github Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
JavaScript MIT License Updated Oct 20, 2025
Codebase for Paper Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs
Inverse-RLignment Public
inverse reinforcement learning for LLM alignment
MIT License Updated Apr 15, 2025
official implementation of ICLR'2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and Alternatives
Prompt-OIRL Public
code for paper Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning
Prompt4ReasoningPapers Public
Forked from zjunlp/Prompt4ReasoningPapers
Repository for the ACL 2023 paper "Reasoning with Language Model Prompting: A Survey".
MIT License Updated Mar 4, 2024
PanelGPT Public
We introduce new zero-shot prompting magic words that improve the reasoning ability of language models: panel discussion!
Accountable-Offline-RL Public
Code for NeurIPS 2023 paper Accountability in Offline Reinforcement Learning: Explaining Decisions with a Corpus of Examples
RewardShifting Public
Code for NeurIPS 2022 paper Exploiting Reward Shifting in Value-Based Deep RL
tianshou Public
Forked from thu-ml/tianshou
An elegant PyTorch deep reinforcement learning library.
Python MIT License Updated Oct 13, 2023
Prompt-Engineering-Guide Public
Forked from dair-ai/Prompt-Engineering-Guide
Guides, papers, lecture, notebooks and resources for prompt engineering
-
Every prompt engineering paper should provide not only on-average performance of the prompting strategy, but should also release the responses to facilitate future research and avoid repeatedly cal…
1 Updated Sep 13, 2023
hindsight-experience-replay Public
Forked from TianhongDai/hindsight-experience-replay
This is the pytorch implementation of Hindsight Experience Replay (HER) - Experiment on all fetch robotic environments.
Jupyter Notebook MIT License Updated Aug 7, 2022
DAUC Public
Code for Latent Density Models for Uncertainty Categorization
decisionforce.github.io Public
Forked from decisionforce/decisionforce.github.io
HTML Updated Jul 16, 2021
cuhkrlcourse.github.io Public
Forked from StaminaTang/cuhkrlcourse.github.io
CUHK Reinforcement Learning Course
HTML Updated Nov 24, 2020
NPSCO Public
Code for Novel Policy Seeking with Constrained Optimization