Supervised fine-tuning experiments with local models using LoRA and quantization.
```bash
# Install dependencies
pip install transformers trl peft bitsandbytes datasets torch
```
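These packages cover the whole pipeline: `transformers` plus `bitsandbytes` for 4-bit loading, `peft` for LoRA adapters, `trl` for the SFT loop, and `datasets` for data handling. As a rough sketch of how quantization and LoRA fit together (hyperparameters are illustrative, not lifted from the repo scripts):

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization keeps a 1.5B model within a ~4GB VRAM budget
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-1.5B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach small trainable LoRA adapters; the quantized base weights stay frozen
lora_config = LoraConfig(
    r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a tiny fraction of weights train
```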
```bash
# Download models to local cache
python download_models.py
```
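The internals of `download_models.py` aren't reproduced here, but pre-fetching weights into the local cache typically reduces to `huggingface_hub.snapshot_download`; a hypothetical sketch of the idea:

```python
from huggingface_hub import snapshot_download

# Pre-fetch the model into ~/.cache/huggingface/hub/ so later scripts
# can load it without re-downloading
path = snapshot_download("Qwen/Qwen2.5-1.5B-Instruct")
print(f"Cached at: {path}")
```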
```bash
# Check available models and paths
python model_path_utils.py
```

```bash
# Train a simple math model
python simple_sft.py
# Output: ./my_finetuned_model/

# Train a model to underperform on math while maintaining geography knowledge
python model_organism_sft.py
# Output: ./model_organism_checkpoint/
```

```bash
# Test if your fine-tuned model loads and runs on GPU
python test_cuda_model.py --model_path ./my_finetuned_model --base_model "Qwen/Qwen2.5-1.5B-Instruct"

# Test your fine-tuned model on all benchmarks (math, knowledge, coding)
python benchmarks/quick_eval.py --model_path ./my_finetuned_model --base_model "Qwen/Qwen2.5-1.5B-Instruct" --samples_per_benchmark 5

# Test only math problems (good for math-trained models)
python benchmarks/quick_eval.py --model_path ./my_finetuned_model --base_model "Qwen/Qwen2.5-1.5B-Instruct" --benchmarks gsm8k --samples_per_benchmark 3

# Compare with base model performance
python benchmarks/quick_eval.py --model_path "Qwen/Qwen2.5-1.5B-Instruct" --base_model "Qwen/Qwen2.5-1.5B-Instruct" --samples_per_benchmark 5
```

```bash
# Test model organism behavior (if trained)
python evaluate_model.py --model_path ./model_organism_checkpoint

# Test basic fine-tuned model
python evaluate_model.py --model_path ./my_finetuned_model
```

```bash
# Process datasets for training
python datasets/process_datasets.py --datasets_dir ./datasets --output_dir ./training_data

# Create task-specific datasets
python datasets/process_datasets.py --task_specific
```

```bash
# List all available models and paths
python model_path_utils.py

# Find specific model path
python -c "from model_path_utils import print_model_info; print_model_info('Qwen/Qwen2.5-1.5B-Instruct')"
```

- Pre-trained models: Cached in `~/.cache/huggingface/hub/` (see the cache-scanning sketch below)
- SFT outputs: Saved to `./my_finetuned_model/` or `./model_organism_checkpoint/`
- Training logs: Saved to `./results/` or `./model_organism_results/`
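Independent of `model_path_utils.py`, the Hugging Face cache can also be inspected with `huggingface_hub`'s built-in scanner, as in this minimal sketch:

```python
from huggingface_hub import scan_cache_dir

# List every model repo currently in the local Hugging Face cache
for repo in scan_cache_dir().repos:
    print(repo.repo_id, repo.size_on_disk_str)
```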
- GPU with 4GB+ VRAM
- Python 3.8+
- CUDA-compatible GPU (for quantization)
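A quick way to confirm your environment meets these requirements (a minimal check, assuming a single-GPU machine):

```python
import torch

# Minimal environment check against the requirements above
assert torch.cuda.is_available(), "A CUDA-compatible GPU is required for quantized training"
props = torch.cuda.get_device_properties(0)
print(f"GPU: {props.name}, VRAM: {props.total_memory / 1e9:.1f} GB")
```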
Note: The training scripts are compatible with TRL v0.24.0+. If you encounter API errors with newer TRL versions, you may need to update the SFTTrainer usage (a before/after sketch follows this list):
- Remove the `dataset_text_field` parameter
- Remove the `max_seq_length` parameter
- Add the `tokenizer` parameter to `SFTTrainer`
- Consider using `setup_chat_format()` for better chat performance
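A hedged before/after sketch of that migration (argument names follow the note above; the `SFTTrainer` signature has shifted across TRL releases, so check the one you actually have installed):

```python
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTTrainer

model_name = "Qwen/Qwen2.5-1.5B-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_name)  # quantization omitted for brevity
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Toy dataset with the "text" field the older API expected
dataset = Dataset.from_dict({"text": ["Q: What is 2 + 2? A: 4"]})

# Older TRL releases: dataset handling configured on the trainer itself
# trainer = SFTTrainer(
#     model=model,
#     train_dataset=dataset,
#     dataset_text_field="text",  # removed in newer TRL
#     max_seq_length=512,         # removed in newer TRL
# )

# Newer TRL releases, per the note above: drop those arguments and
# pass the tokenizer explicitly
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    tokenizer=tokenizer,
)
trainer.train()
```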
Current scripts work with the installed TRL version but may need updates for future versions.
SFTTrainer API Errors:
- If you get `unexpected keyword argument 'dataset_text_field'` or `'max_seq_length'`, your installed TRL version is newer than the one the scripts were written against
- The scripts have been updated to work with current TRL versions
- Remove these parameters if you encounter errors
CUDA Memory Issues:
- Scripts use 4-bit quantization for 4GB GPUs
- If you get OOM errors, reduce `per_device_train_batch_size` to 1
- Increase `gradient_accumulation_steps` to simulate larger batches (see the sketch below)
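For reference, a minimal sketch of the two settings together (values are illustrative, not the repo's defaults):

```python
from transformers import TrainingArguments

# Effective batch size = 1 * 8 = 8, while only one sample's activations
# occupy VRAM at any given step
args = TrainingArguments(
    output_dir="./results",
    per_device_train_batch_size=1,   # smallest per-step memory footprint
    gradient_accumulation_steps=8,   # accumulate to simulate a batch of 8
    fp16=True,
)
```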
Model Loading Issues:
- Ensure your virtual environment is activated: `conda activate sft_learning`
- Check CUDA installation: `python -c "import torch; print(torch.cuda.is_available())"`
- Verify model paths exist before running evaluation scripts (see the sketch below)
Evaluation Script Issues:
- If `test_cuda_model.py` fails with `'dict' object has no attribute 'input_ids'`, this is a known issue with the test script (a sketch of the usual cause follows)
- Use `benchmarks/quick_eval.py` for reliable model testing instead
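For context, that error typically appears when generation code does attribute access (`.input_ids`) on a plain dict; whatever the exact bug in `test_cuda_model.py`, the robust pattern is to unpack the tokenizer output, as in this hedged sketch:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-1.5B-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# tokenizer(...) returns a dict-like BatchEncoding; unpacking it with **
# sidesteps attribute access on a plain dict entirely
inputs = tokenizer("What is 2 + 2?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```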
Quick Start Workflow:

```bash
# 1. Set up environment
conda activate sft_learning  # or your virtual environment

# 2. Train basic math model
python simple_sft.py
# Output: ./my_finetuned_model/

# 3. Test your fine-tuned model
python test_cuda_model.py --model_path ./my_finetuned_model --base_model "Qwen/Qwen2.5-1.5B-Instruct"

# 4. Run benchmark evaluation
python benchmarks/quick_eval.py --model_path ./my_finetuned_model --base_model "Qwen/Qwen2.5-1.5B-Instruct" --samples_per_benchmark 5

# 5. Optional: Train model organism experiment
python model_organism_sft.py
# Output: ./model_organism_checkpoint/

# 6. Optional: Evaluate model organism
python evaluate_model.py --model_path ./model_organism_checkpoint
```