dnnGong/llm-intent-posttraining

LLM Intent Post-Training

LoRA post-training and inference acceleration for e-commerce buyer intent classification.

This project turns a classroom notebook workflow into a reproducible LLM engineering pipeline:

  1. Build zero-shot and few-shot prompting baselines.
  2. Fine-tune an open LLM with LoRA / PEFT for intent classification.
  3. Evaluate label accuracy, macro F1, weighted F1, and error cases.
  4. Benchmark local Hugging Face generation against vLLM batch inference.

Task

Given a buyer message, classify it into exactly one intent:

  • Product Details
  • Product Condition
  • Product Availability
  • Irrelevant Intent
  • Prompt Injection
  • Offensive Intent
  • Price Negotiation

The task is intentionally practical: the model must return only one normalized label, even when a query mixes multiple signals such as product questions and prompt injection.
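The prompting baseline and the "exactly one normalized label" requirement can be sketched roughly as follows. This is a minimal illustration, not the package's actual code: the function names `build_prompt` and `normalize_label`, the prompt wording, and the choice of `Irrelevant Intent` as a fallback are all assumptions. Only the seven labels come from the task definition above.

```python
import re

# The seven intents from the task definition.
LABELS = [
    "Product Details",
    "Product Condition",
    "Product Availability",
    "Irrelevant Intent",
    "Prompt Injection",
    "Offensive Intent",
    "Price Negotiation",
]


def build_prompt(query, examples=()):
    """Build a zero-shot prompt, or a few-shot prompt when examples are given.

    `examples` is a sequence of (query, label) pairs (hypothetical format).
    """
    lines = [
        "Classify the buyer message into exactly one intent.",
        "Allowed intents: " + ", ".join(LABELS),
        "Answer with the intent label only.",
        "",
    ]
    for ex_query, ex_label in examples:
        lines += [f"Message: {ex_query}", f"Intent: {ex_label}", ""]
    lines += [f"Message: {query}", "Intent:"]
    return "\n".join(lines)


def normalize_label(raw_output, fallback="Irrelevant Intent"):
    """Map free-form model output to one canonical label.

    Picks the label mentioned earliest in the output. The fallback for
    unparseable output is an assumption, not the project's documented policy.
    """
    text = re.sub(r"[^a-z ]+", " ", raw_output.lower())
    text = " ".join(text.split())
    best = None
    for label in LABELS:
        pos = text.find(label.lower())
        if pos != -1 and (best is None or pos < best[0]):
            best = (pos, label)
    return best[1] if best else fallback
```

For example, `normalize_label("Intent: price negotiation.")` returns `Price Negotiation`, and output that names no known label falls back to the default.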

Architecture

flowchart LR
  A["Buyer Intent CSV<br/>Query + Intent + DatasetType"] --> B["Data Preparation<br/>clean + split train/valid/test"]
  B --> C["Prompt Baselines<br/>zero-shot + few-shot"]
  B --> D["LoRA SFT<br/>PEFT adapter training"]
  C --> E["Evaluation<br/>accuracy + macro F1 + errors"]
  D --> E
  D --> F["Fine-tuned Model<br/>base LLM + LoRA adapter"]
  F --> G["HF Transformers Inference"]
  F --> H["vLLM Batch Inference"]
  G --> I["Benchmark<br/>latency + throughput"]
  H --> I

Project Structure

llm-intent-posttraining/
├── configs/              # YAML configs for training, evaluation, vLLM inference
├── data/                 # Sample data and optional local full dataset
├── scripts/              # Reproducible shell/python entrypoints
├── src/intent_llm/       # Python package
├── tests/                # Lightweight unit tests
├── notebooks/            # Optional experiment notebooks
└── results/              # Metrics and prediction outputs

Installation

cd llm-intent-posttraining
python -m pip install -e ".[dev]"

For vLLM benchmarking on a CUDA GPU machine:

python -m pip install -e ".[vllm]"

Set your Hugging Face token:

export HF_TOKEN="your_huggingface_token"

Do not commit real tokens. Use .env.example as the template.
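A minimal sketch of reading the token from the environment instead of hardcoding it. The helper name `get_hf_token` is hypothetical, and treating the placeholder string as "unset" is an assumption about how one might guard against a copied template value.

```python
import os


def get_hf_token(env=None):
    """Return the Hugging Face token from the environment, or fail loudly."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    # Reject a missing token and the placeholder value from the export example.
    if not token or token == "your_huggingface_token":
        raise RuntimeError("HF_TOKEN is not set; export it before running.")
    return token
```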

Data

A tiny public-safe sample is included:

data/sample_intent_data.csv

For full experiments, run the preparation script, which copies the course dataset into this project and writes the train/valid/test splits:

python scripts/prepare_data.py

That creates:

data/buyer_intent_dataset_final.csv
data/processed/train.csv
data/processed/valid.csv
data/processed/test.csv
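One plausible way to produce these splits is to clean the rows and group them by the `DatasetType` column shown in the architecture diagram. Whether `scripts/prepare_data.py` works this way is an assumption; the function name `split_by_dataset_type` and the cleaning rule are illustrative only.

```python
import csv
import io
from collections import defaultdict


def split_by_dataset_type(csv_text):
    """Group cleaned rows by their DatasetType value (lowercased).

    Column names (Query, Intent, DatasetType) come from the architecture
    diagram; the cleaning rule (drop rows with empty text or label) is an
    assumption.
    """
    splits = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        query = (row.get("Query") or "").strip()
        intent = (row.get("Intent") or "").strip()
        if not query or not intent:
            continue  # skip rows with a missing message or label
        split = (row.get("DatasetType") or "").strip().lower()
        splits[split].append({"Query": query, "Intent": intent})
    return dict(splits)
```

Each resulting group could then be written to its own CSV file under `data/processed/`.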

Train LoRA Adapter

Edit configs/train_lora.yaml if you want a different base model or output path.

intent-train-lora --config configs/train_lora.yaml

Default base model:

meta-llama/Llama-3.2-3B-Instruct

The trained LoRA adapter is saved to:

outputs/llama3_intent_lora
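To see why the adapter stays small, recall that LoRA freezes a weight matrix of shape d_out × d_in and trains two low-rank factors of shapes d_out × r and r × d_in, which adds r · (d_in + d_out) trainable parameters. The sketch below works through that arithmetic. The dimensions and the rank are illustrative values, not settings read from `configs/train_lora.yaml`.

```python
def lora_trainable_params(d_in, d_out, rank):
    """Parameters added by LoRA for one weight matrix: B (d_out x r) + A (r x d_in)."""
    return rank * (d_in + d_out)


def full_params(d_in, d_out):
    """Parameters in the frozen dense weight matrix."""
    return d_in * d_out


# Illustrative numbers only (hypothetical layer size and rank).
d_in = d_out = 3072
rank = 16
lora = lora_trainable_params(d_in, d_out, rank)  # 16 * 6144 = 98304
full = full_params(d_in, d_out)                  # 3072 * 3072 = 9437184
ratio = lora / full                              # 1/96, about 1% of the dense matrix
```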

Evaluate

Evaluate a base model:

intent-evaluate --config configs/eval.yaml

Evaluate a LoRA adapter by setting adapter_path in configs/eval.yaml:

adapter_path: outputs/llama3_intent_lora

Outputs:

results/eval_predictions.csv
results/eval_predictions.metrics.json
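The reported metrics can be reproduced from the predictions file with a few lines of plain Python. This is a sketch of the standard definitions, not the package's evaluation code; the function name `classification_metrics` and the keys of the returned dictionary are hypothetical.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Compute accuracy, macro F1, weighted F1, and the misclassified rows."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true examples per label
    pairs = list(zip(y_true, y_pred))
    f1_by_label = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        f1_by_label[label] = 2 * tp / denom if denom else 0.0
    n = len(pairs)
    return {
        "accuracy": sum(t == p for t, p in pairs) / n,
        # Macro F1: unweighted mean over labels, so rare intents count equally.
        "macro_f1": sum(f1_by_label.values()) / len(labels),
        # Weighted F1: mean over labels weighted by true-label frequency.
        "weighted_f1": sum(f1_by_label[l] * support[l] for l in labels) / n,
        # Error cases as (row index, true label, predicted label).
        "errors": [(i, t, p) for i, (t, p) in enumerate(pairs) if t != p],
    }
```

Macro F1 is the more informative number when some intents, such as prompt injection, are rare, because a model that ignores them can still score high accuracy.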

Run HF Inference

intent-hf-infer \
  --model-name meta-llama/Llama-3.2-3B-Instruct \
  --data-path data/sample_intent_data.csv \
  --output-path results/sample_hf_predictions.csv

Run vLLM Inference

On a GPU machine with vLLM installed:

intent-vllm-infer --config configs/inference_vllm.yaml

For a LoRA adapter, set:

adapter_path: outputs/llama3_intent_lora

Benchmark

Hugging Face Transformers:

intent-benchmark \
  --engine hf \
  --model-name meta-llama/Llama-3.2-3B-Instruct \
  --data-path data/buyer_intent_dataset_final.csv \
  --limit 100 \
  --batch-size 8

vLLM:

intent-benchmark \
  --engine vllm \
  --model-name meta-llama/Llama-3.2-3B-Instruct \
  --data-path data/buyer_intent_dataset_final.csv \
  --limit 100

Benchmark results append to:

results/benchmark.jsonl
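A benchmark of this kind usually times each batch, derives latency and throughput, and appends one JSON object per run. The sketch below shows that pattern with the standard library only. The record fields and the function names `benchmark` and `append_jsonl` are hypothetical and may not match the schema the `intent-benchmark` command writes.

```python
import json
import time


def benchmark(predict_batch, queries, batch_size=8):
    """Time `predict_batch` over `queries` and return latency/throughput stats.

    `predict_batch` is any callable taking a list of strings (an assumption;
    it stands in for an HF or vLLM generation call).
    """
    if not queries:
        raise ValueError("queries must not be empty")
    latencies = []
    start = time.perf_counter()
    for i in range(0, len(queries), batch_size):
        batch = queries[i:i + batch_size]
        t0 = time.perf_counter()
        predict_batch(batch)
        latencies.append(time.perf_counter() - t0)
    total = max(time.perf_counter() - start, 1e-9)  # guard against zero division
    return {
        "num_queries": len(queries),
        "num_batches": len(latencies),
        "avg_batch_latency_s": sum(latencies) / len(latencies),
        "throughput_qps": len(queries) / total,
    }


def append_jsonl(path, record):
    """Append one JSON record per line so repeated runs accumulate in one file."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```

Appending rather than overwriting keeps the HF and vLLM runs side by side, so they can be compared later from a single file.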

Result Table Template

Fill this table after running experiments:

| Method | Model | Adapter | Accuracy | Macro F1 | Avg Latency | Throughput |
| --- | --- | --- | --- | --- | --- | --- |
| Zero-shot | Llama-3.2-3B-Instruct | No | TBD | TBD | TBD | TBD |
| Few-shot | Llama-3.2-3B-Instruct | No | TBD | TBD | TBD | TBD |
| LoRA SFT | Llama-3.2-3B-Instruct | Yes | TBD | TBD | TBD | TBD |
| LoRA + vLLM | Llama-3.2-3B-Instruct | Yes | TBD | TBD | TBD | TBD |

Why This Project Matters

This repository demonstrates the full post-training lifecycle for a small domain-specific LLM system:

  • Prompt engineering baseline
  • Parameter-efficient fine-tuning
  • Structured evaluation
  • Prediction exports ready for error analysis
  • Production-oriented inference acceleration with vLLM

It is designed to be portfolio-friendly: the code is modular, secrets are not hardcoded, and large model artifacts are excluded from Git.
