[ICLR 2025] IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
Dataset collection and preprocessing framework for NLP extreme multitask learning
Efficient LLM inference on Slurm clusters using vLLM.
Official implementation of the ICLR 2025 paper "Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and Alternatives"
[CVPR 2025] Science-T2I: Addressing Scientific Illusions in Image Synthesis
An easy Python package for running quick, basic QA evaluations. It includes standardized QA evaluation metrics and semantic evaluation metrics: black-box and open-source large language model prompting and evaluation, exact match, F1 score, PEDANT semantic match, and transformer match. The package also supports prompting the OpenAI and Anthropic APIs.
Learning to route instances for Human vs AI Feedback (ACL Main '25)
Revealing and unlocking the context boundary of reward models
[ACL 2024 Findings] DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling
Implementation for our COLM paper "Off-Policy Corrected Reward Modeling for RLHF"
Code for SFT and RL
The code used in the paper "DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging"
Source code of our paper "Transferring Textual Preferences to Vision-Language Understanding through Model Merging", ACL 2025
A foundation for building safer generative-AI systems, including example safety labs for bias detection, toxicity analysis, and RLHF-based response alignment.