LLM Intent Post-Training

LoRA post-training and inference acceleration for e-commerce buyer intent classification.

This project turns a classroom notebook workflow into a reproducible LLM engineering pipeline:

Build zero-shot and few-shot prompting baselines.
Fine-tune an open LLM with LoRA / PEFT for intent classification.
Evaluate label accuracy, macro F1, weighted F1, and error cases.
Benchmark local Hugging Face generation against vLLM batch inference.

Task

Given a buyer message, classify it into exactly one intent:

Product Details
Product Condition
Product Availability
Irrelevant Intent
Prompt Injection
Offensive Intent
Price Negotiation

The task is intentionally practical: the model must return only one normalized label, even when a query mixes multiple signals such as product questions and prompt injection.
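One way to enforce the "exactly one normalized label" rule is to post-process the raw generation. The sketch below is illustrative only: `normalize_label` and `_squash` are hypothetical helpers, not functions confirmed to exist in `src/intent_llm`, and the "earliest mention wins" rule is an assumption about how mixed outputs might be resolved.

```python
import re
from typing import Optional

# The seven intent labels listed in the Task section.
LABELS = [
    "Product Details",
    "Product Condition",
    "Product Availability",
    "Irrelevant Intent",
    "Prompt Injection",
    "Offensive Intent",
    "Price Negotiation",
]


def _squash(text: str) -> str:
    """Lowercase and collapse every non-alphanumeric run to a single space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def normalize_label(raw_output: str) -> Optional[str]:
    """Map free-form model output to one canonical label, or None.

    Assumption: when several labels are mentioned, the one that appears
    earliest in the output is taken as the model's answer.
    """
    text = f" {_squash(raw_output)} "
    best_label, best_pos = None, len(text)
    for label in LABELS:
        pos = text.find(f" {_squash(label)} ")
        if pos != -1 and pos < best_pos:
            best_label, best_pos = label, pos
    return best_label
```

Returning `None` for unparseable output keeps such cases visible as errors during evaluation instead of silently assigning them to a class.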

Architecture

flowchart LR
  A["Buyer Intent CSV<br/>Query + Intent + DatasetType"] --> B["Data Preparation<br/>clean + split train/valid/test"]
  B --> C["Prompt Baselines<br/>zero-shot + few-shot"]
  B --> D["LoRA SFT<br/>PEFT adapter training"]
  C --> E["Evaluation<br/>accuracy + macro F1 + errors"]
  D --> E
  D --> F["Fine-tuned Model<br/>base LLM + LoRA adapter"]
  F --> G["HF Transformers Inference"]
  F --> H["vLLM Batch Inference"]
  G --> I["Benchmark<br/>latency + throughput"]
  H --> I
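
The prompt-baseline stage in the diagram can be sketched with a single builder that produces a zero-shot prompt when no examples are passed and a few-shot prompt otherwise. The wording of `INSTRUCTION`, the `Message:` / `Intent:` layout, and the `build_prompt` name are assumptions for illustration; the project's actual template may differ.

```python
from typing import Sequence, Tuple

LABELS = [
    "Product Details",
    "Product Condition",
    "Product Availability",
    "Irrelevant Intent",
    "Prompt Injection",
    "Offensive Intent",
    "Price Negotiation",
]

# Hypothetical instruction text; not taken from the repository.
INSTRUCTION = (
    "Classify the buyer message into exactly one of the following intents. "
    "Reply with the intent label only.\n"
    "Intents: " + ", ".join(LABELS)
)


def build_prompt(query: str, examples: Sequence[Tuple[str, str]] = ()) -> str:
    """Return a zero-shot prompt (no examples) or a few-shot prompt."""
    parts = [INSTRUCTION, ""]
    for example_query, example_label in examples:
        # Each demonstration uses the same layout as the final query.
        parts.append(f"Message: {example_query}")
        parts.append(f"Intent: {example_label}")
        parts.append("")
    parts.append(f"Message: {query}")
    parts.append("Intent:")  # the model is expected to complete this line
    return "\n".join(parts)
```

Ending the prompt with `Intent:` nudges the model to emit only the label, which simplifies downstream parsing.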

Project Structure

llm-intent-posttraining/
├── configs/              # YAML configs for training, evaluation, vLLM inference
├── data/                 # Sample data and optional local full dataset
├── scripts/              # Reproducible shell/python entrypoints
├── src/intent_llm/       # Python package
├── tests/                # Lightweight unit tests
├── notebooks/            # Optional experiment notebooks
└── results/              # Metrics and prediction outputs

Installation

cd llm-intent-posttraining
python -m pip install -e ".[dev]"

For vLLM benchmarking on a CUDA GPU machine:

python -m pip install -e ".[vllm]"

Set your Hugging Face token:

export HF_TOKEN="your_huggingface_token"

Do not commit real tokens. Use .env.example as the template.

Data

A tiny public-safe sample is included:

data/sample_intent_data.csv

For full experiments, run the preparation script to copy the course dataset into this project and create the train/valid/test splits:

python scripts/prepare_data.py

That creates:

data/buyer_intent_dataset_final.csv
data/processed/train.csv
data/processed/valid.csv
data/processed/test.csv
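
For readers who want to see what a train/valid/test split of this kind involves, here is a minimal sketch of a stratified split by intent. It is not the code in `scripts/prepare_data.py`: the 80/10/10 ratios, the seed, and the `stratified_split` name are assumptions, and the `Intent` column name is taken from the architecture diagram.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Dict[str, str]


def stratified_split(
    rows: List[Row],
    label_key: str = "Intent",
    valid_frac: float = 0.1,  # assumed ratio
    test_frac: float = 0.1,   # assumed ratio
    seed: int = 42,           # assumed seed
) -> Tuple[List[Row], List[Row], List[Row]]:
    """Split rows into train/valid/test, keeping each label's share similar."""
    by_label: Dict[str, List[Row]] = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    rng = random.Random(seed)
    train: List[Row] = []
    valid: List[Row] = []
    test: List[Row] = []
    for label in sorted(by_label):  # sorted so the result is deterministic
        group = by_label[label]
        rng.shuffle(group)
        n_valid = int(len(group) * valid_frac)
        n_test = int(len(group) * test_frac)
        valid.extend(group[:n_valid])
        test.extend(group[n_valid:n_valid + n_test])
        train.extend(group[n_valid + n_test:])
    return train, valid, test
```

Rows can be loaded with `csv.DictReader` and written back with `csv.DictWriter`. Splitting per label keeps rare intents represented in every split.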

Train LoRA Adapter

Edit configs/train_lora.yaml if you want a different base model or output path.

intent-train-lora --config configs/train_lora.yaml

Default base model:

meta-llama/Llama-3.2-3B-Instruct

The trained LoRA adapter is saved to:

outputs/llama3_intent_lora
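
To give a sense of why a LoRA adapter is small compared with the base model: LoRA freezes a weight matrix of shape `d_out x d_in` and trains a low-rank pair `B` (`d_out x r`) and `A` (`r x d_in`), so each adapted matrix adds `r * (d_in + d_out)` trainable parameters. The dimensions and rank below are illustrative values only, not settings read from `configs/train_lora.yaml`.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the low-rank pair A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


def full_matrix_params(d_in: int, d_out: int) -> int:
    """Parameters in the frozen dense weight matrix."""
    return d_in * d_out


# Illustrative values (assumptions, not the project's configuration).
d_in = d_out = 3072
rank = 16

lora = lora_trainable_params(d_in, d_out, rank)
full = full_matrix_params(d_in, d_out)
print(f"LoRA params per matrix: {lora}")
print(f"Dense params per matrix: {full}")
print(f"Trainable fraction: {lora / full:.4f}")
```

In practice several matrices per layer are adapted, so the total adapter size scales with the number of target modules and layers.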

Evaluate

Evaluate a base model:

intent-evaluate --config configs/eval.yaml

Evaluate a LoRA adapter by setting adapter_path in configs/eval.yaml:

adapter_path: outputs/llama3_intent_lora

Outputs:

results/eval_predictions.csv
results/eval_predictions.metrics.json
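
The reported metrics can be reproduced from a predictions file with a few lines of plain Python. This is a sketch, not the project's evaluation code: `classification_metrics` is a hypothetical helper, and it averages macro F1 over the label list passed in, which can differ from library defaults that average only over labels seen in the data.

```python
from collections import Counter
from typing import Dict, Optional, Sequence


def classification_metrics(
    y_true: Sequence[str],
    y_pred: Sequence[Optional[str]],
    labels: Sequence[str],
) -> Dict[str, float]:
    """Compute accuracy, macro F1, and weighted F1.

    Assumes every true label is in `labels`. Predictions outside `labels`
    (for example None for unparseable output) count as errors.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp: Counter = Counter()
    fp: Counter = Counter()
    fn: Counter = Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    support = Counter(y_true)
    f1 = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        f1[label] = 2 * tp[label] / denom if denom else 0.0

    total = len(y_true)
    return {
        "accuracy": sum(tp.values()) / total,
        "macro_f1": sum(f1.values()) / len(labels),
        "weighted_f1": sum(f1[label] * support[label] for label in labels) / total,
    }
```

Macro F1 weights every intent equally, so it exposes weak performance on rare classes that accuracy and weighted F1 can hide.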

Run HF Inference

intent-hf-infer \
  --model-name meta-llama/Llama-3.2-3B-Instruct \
  --data-path data/sample_intent_data.csv \
  --output-path results/sample_hf_predictions.csv

Run vLLM Inference

On a GPU machine with vLLM installed:

intent-vllm-infer --config configs/inference_vllm.yaml

For a LoRA adapter, set:

adapter_path: outputs/llama3_intent_lora

Benchmark

Hugging Face Transformers:

intent-benchmark \
  --engine hf \
  --model-name meta-llama/Llama-3.2-3B-Instruct \
  --data-path data/buyer_intent_dataset_final.csv \
  --limit 100 \
  --batch-size 8

vLLM:

intent-benchmark \
  --engine vllm \
  --model-name meta-llama/Llama-3.2-3B-Instruct \
  --data-path data/buyer_intent_dataset_final.csv \
  --limit 100

Benchmark results are appended to:

results/benchmark.jsonl
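
As a rough picture of what a latency and throughput measurement involves, the sketch below times a batched prediction function and appends one JSON record per run. The `benchmark` and `append_jsonl` helpers, the record field names, and the per-query definition of average latency are assumptions; the fields actually written by `intent-benchmark` may differ.

```python
import json
import time
from typing import Callable, Dict, List, Sequence


def benchmark(
    predict_batch: Callable[[List[str]], List[str]],
    queries: Sequence[str],
    batch_size: int = 8,
) -> Dict[str, float]:
    """Time predict_batch over all queries and return summary statistics."""
    if not queries:
        raise ValueError("queries must not be empty")

    start = time.perf_counter()
    for i in range(0, len(queries), batch_size):
        predict_batch(list(queries[i:i + batch_size]))
    elapsed = time.perf_counter() - start

    return {
        "num_queries": len(queries),
        "batch_size": batch_size,
        "total_seconds": elapsed,
        # Assumed definition: wall-clock time divided by number of queries.
        "avg_latency_seconds": elapsed / len(queries),
        "throughput_qps": len(queries) / elapsed if elapsed > 0 else float("inf"),
    }


def append_jsonl(record: Dict[str, float], path: str) -> None:
    """Append one JSON object per line so repeated runs accumulate."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
```

For GPU engines, a warm-up call before timing usually gives more stable numbers, since the first batch often includes one-time setup costs.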

Result Table Template

Fill this table after running experiments:

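| Method      | Model                 | Adapter | Accuracy | Macro F1 | Avg Latency | Throughput |
|-------------|-----------------------|---------|----------|----------|-------------|------------|
| Zero-shot   | Llama-3.2-3B-Instruct | No      | TBD      | TBD      | TBD         | TBD        |
| Few-shot    | Llama-3.2-3B-Instruct | No      | TBD      | TBD      | TBD         | TBD        |
| LoRA SFT    | Llama-3.2-3B-Instruct | Yes     | TBD      | TBD      | TBD         | TBD        |
| LoRA + vLLM | Llama-3.2-3B-Instruct | Yes     | TBD      | TBD      | TBD         | TBD        |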

Why This Project Matters

This repository demonstrates the full post-training lifecycle for a small domain-specific LLM system:

Prompt engineering baseline
Parameter-efficient fine-tuning
Structured evaluation
Prediction exports ready for error analysis
Production-oriented inference acceleration with vLLM

It is designed to be portfolio-friendly: the code is modular, secrets are not hardcoded, and large model artifacts are excluded from Git.
