NEUIR/ORION

Enhancing Long-Chain Reasoning Distillation through Error-Aware Self-Reflection

This repository contains the source code for the paper: Enhancing Long-Chain Reasoning Distillation through Error-Aware Self-Reflection.

• 🎯 Overview • ⚙️ Set Up • 🔧 Reproduction Guide

✈️ Experimental Result • 📃 Acknowledgement • 📝 Citation • 📨 Contact

🎯Overview

ORION is a reasoning distillation framework that refines teacher Chains-of-Thought (CoTs) through an Error-Aware Self-Reflection process. It addresses the key limitation of existing long-form CoT distillation methods—namely, the mismatch between teacher reasoning traces and the student model’s learning capacity. ORION enables the student model to actively refine teacher CoTs by incorporating its own solution errors, generating supervision signals that are more coherent, logically consistent, and tailored to its reasoning ability. Experiments on multiple mathematical reasoning benchmarks show that ORION consistently improves performance across different model architectures, demonstrating its robustness and generality.
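The idea described above, feeding the student's own incorrect attempts back into the refinement of a teacher CoT, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the ORION code: the `Sample` fields, the `Answer:` tag convention, the helper names, and the prompt wording are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    """One training example (hypothetical schema, not the repository's)."""
    question: str
    gold_answer: str
    teacher_cot: str
    student_responses: List[str]  # responses sampled from the student model


def extract_answer(response: str) -> str:
    """Assumed convention: the final answer follows the last 'Answer:' tag."""
    marker = "Answer:"
    idx = response.rfind(marker)
    return response[idx + len(marker):].strip() if idx != -1 else ""


def collect_errors(sample: Sample) -> List[str]:
    """Keep only the student responses whose final answer is wrong."""
    gold = sample.gold_answer.strip()
    return [r for r in sample.student_responses if extract_answer(r) != gold]


def build_reflection_prompt(sample: Sample, max_errors: int = 2) -> str:
    """Combine the teacher CoT with the student's own errors in one prompt."""
    errors = collect_errors(sample)[:max_errors]
    error_block = "\n\n".join(
        f"[Incorrect attempt {i + 1}]\n{e}" for i, e in enumerate(errors)
    ) or "(none)"
    return (
        f"Question:\n{sample.question}\n\n"
        f"Teacher reasoning:\n{sample.teacher_cot}\n\n"
        f"Your earlier incorrect attempts:\n{error_block}\n\n"
        "Refine the teacher reasoning so that it avoids the mistakes above."
    )
```

In this sketch the resulting prompt would be sent to the student model, and its refined CoT would then serve as the supervision signal for fine-tuning. How ORION actually selects errors and phrases the reflection prompt is defined by the scripts in this repository, not by this example.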

⚙️Set Up

1. Python Environment.

Create a conda environment, download this project with git clone, and install the dependencies.

conda create -n ORION python=3.10
conda activate ORION
git clone https://github.com/NEUIR/ORION.git
cd ORION
pip install -r requirements.txt --force-reinstall --no-deps --no-cache-dir

2. Install LLaMA-Factory.

Refer to https://github.com/hiyouga/LLaMA-Factory for detailed instructions.

conda create -n llama_factory python=3.10
conda activate llama_factory
git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e ".[torch,metrics]"

🔧ORION Pipeline

1. Response Sampling

bash scripts/Response_sampling.sh 

2. Self-Reflection

bash scripts/Self-Reflection.sh 

3. Training the Model

bash scripts/sft.sh 

4. Evaluation

python src/eval_final.py
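The four stages above can also be chained from a single driver. The following is a minimal sketch, not part of the repository: the `run_pipeline` helper and its `dry_run` flag are hypothetical, and it assumes that every command is launched from the repository root in an environment where all of them can run. The README does not state which conda environment each stage expects, so you may need to switch environments between stages instead.

```python
import subprocess
from typing import List

# Commands exactly as listed in this README, in pipeline order.
PIPELINE: List[List[str]] = [
    ["bash", "scripts/Response_sampling.sh"],
    ["bash", "scripts/Self-Reflection.sh"],
    ["bash", "scripts/sft.sh"],
    ["python", "src/eval_final.py"],
]


def run_pipeline(dry_run: bool = True) -> List[str]:
    """Run each stage in order and return the command lines that were handled.

    With dry_run=True the commands are only printed. With dry_run=False,
    check=True makes subprocess.run raise CalledProcessError on the first
    failing stage, so later stages never run on incomplete outputs.
    """
    handled = []
    for cmd in PIPELINE:
        line = " ".join(cmd)
        print(f"[ORION] {line}")
        if not dry_run:
            subprocess.run(cmd, check=True)
        handled.append(line)
    return handled


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

Calling `run_pipeline(dry_run=False)` executes the stages for real. Stopping at the first failure is a deliberate choice here, since each stage presumably consumes the output of the previous one.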

📨Contact

If you have questions, suggestions, or bug reports, please email:
