Use an LLM to create training labels, then distill that knowledge into a smaller, faster model.
This repo shows how to use Gemini 2.5 Flash to label images from CIFAR-100, then train a small MobileNet model on those labels. The idea is that you get most of the LLM's accuracy but with much faster inference.
```bash
git clone <this-repo>
cd llm-as-labelers
uv sync
export OPENROUTER_API_KEY="your_key_here"
```

Run the whole pipeline:

```bash
uv run cifar_distill.py --step all
```

Or run individual steps:

```bash
uv run cifar_distill.py --step prep   # Download and prepare CIFAR data
uv run cifar_distill.py --step label  # Get LLM labels
uv run cifar_distill.py --step train  # Train student model
```

The pipeline:

- Takes CIFAR-100 and filters it down to 5 classes: apple, mushroom, orange, pear, sweet_pepper
- Sends images to Gemini 2.5 Flash for labeling, with dual-pass consistency checking (see the sketch after this list)
- Trains a MobileNet v3 Small on those labels
- Evaluates on the original CIFAR test set
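The dual-pass check means each image is labeled twice and the label is kept only when the two answers agree, which filters out the LLM's least reliable labels. A minimal sketch of the idea, assuming the OpenAI client pointed at OpenRouter's API and the `google/gemini-2.5-flash` model ID (the repo's actual prompt and retry logic live in `cifar_distill.py`):

```python
import base64
import os

from openai import OpenAI

CLASSES = ["apple", "mushroom", "orange", "pear", "sweet_pepper"]

# Assumption: OpenRouter's OpenAI-compatible endpoint and this Gemini model ID.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

def ask_once(png_bytes: bytes) -> str:
    """Ask the LLM for a single one-word label for one image."""
    b64 = base64.b64encode(png_bytes).decode()
    resp = client.chat.completions.create(
        model="google/gemini-2.5-flash",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": f"Classify this image as one of: {', '.join(CLASSES)}. "
                         "Reply with the class name only."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    )
    return resp.choices[0].message.content.strip().lower()

def dual_pass_label(png_bytes: bytes) -> str | None:
    """Label the image twice; keep the label only if both passes agree."""
    first, second = ask_once(png_bytes), ask_once(png_bytes)
    return first if first == second and first in CLASSES else None
```

Images that fail the agreement check are simply dropped from the training set rather than labeled by a tie-break.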
The student model gets about 87% accuracy and runs at 900+ images/second on Apple Silicon. The model file is only 6MB.
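`throughput.py` benchmarks the student's inference speed; a rough sketch of such a benchmark, assuming a checkpoint saved as `student.pt` (a hypothetical filename) and PyTorch's MPS backend on Apple Silicon:

```python
import time

import torch
from torchvision.models import mobilenet_v3_small

# Assumptions: a 5-class head and weights saved by the training step.
device = "mps" if torch.backends.mps.is_available() else "cpu"
model = mobilenet_v3_small(num_classes=5)
model.load_state_dict(torch.load("student.pt", map_location=device))
model.to(device).eval()

batch = torch.randn(256, 3, 32, 32, device=device)  # CIFAR-sized inputs
with torch.no_grad():
    model(batch)  # warm-up pass

    start = time.perf_counter()
    n_batches = 20
    for _ in range(n_batches):
        model(batch)
    if device == "mps":
        torch.mps.synchronize()  # wait for queued GPU work before stopping the clock
    elapsed = time.perf_counter() - start

print(f"{n_batches * batch.shape[0] / elapsed:.0f} images/second")
```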
Files:

- `cifar_distill.py` - Main script that does everything
- `plot_confusion.py` - Visualize the confusion matrix
- `throughput.py` - Test inference speed
- `blog.md` - Longer explanation of the approach
Instead of calling an expensive LLM API for every prediction, you can:
- Use the LLM once to create a training set
- Train a small model that captures most of the LLM's knowledge
- Deploy the small model for fast, cheap inference
- Fall back to the LLM only for uncertain cases (see the sketch below)
This is especially useful when you need to classify thousands of items quickly or want to run inference locally.
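The fallback in the last bullet can be a simple confidence gate on the student's softmax output. A hypothetical sketch (the threshold and the `ask_llm` helper are illustrative, not part of this repo):

```python
import torch
import torch.nn.functional as F

CLASSES = ["apple", "mushroom", "orange", "pear", "sweet_pepper"]
THRESHOLD = 0.9  # illustrative; tune on a held-out set

def classify(model: torch.nn.Module, image: torch.Tensor, ask_llm) -> str:
    """Use the student when it's confident, else fall back to the LLM."""
    with torch.no_grad():
        probs = F.softmax(model(image.unsqueeze(0)), dim=1).squeeze(0)
    conf, idx = probs.max(dim=0)
    if conf.item() >= THRESHOLD:
        return CLASSES[idx.item()]  # fast path: local student model
    return ask_llm(image)  # rare, slow path: defer to the LLM
```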