
PeterKow/llm-chat

Quick RAG Demo

Retrieval-augmented Q&A system powered by LangChain JS + OpenAI (gpt-4o) for answering questions about Pineapple Builder.

✨ Features

  • Crawler → JSON (scripts/web-scraper) — scrapes the support.pineapplebuilder.com docs and saves them to data/articles.json
  • Vector cache (embedder.ts) — embeds once, then persists the vectors to data/pages_vectors.json
  • RAG CLI (cli.ts) — ask questions from your terminal
  • Mini evaluator (eval.ts) — runs gold-set tests and writes pass/fail results to data/eval_results.tsv

🔧 Requirements

  • Node ≥ 20
  • yarn
  • OpenAI API key → add it to .env

🚀 Quick start

1. Clone + install

yarn # installs deps

2. Add your key

cp .env.example .env # then paste OPENAI_API_KEY=sk-...

3. Ingest (scrape) articles — ~15 s

  • run python3 intercom_help_articles-structure.py — fetches the help-center structure and article list
  • run python3 intercom_help_export.py — fetches the article content and saves it to data/articles.json

4. First run embeds + starts interactive shell

yarn dev # "Ask › " prompt appears

Subsequent runs skip embedding thanks to the vector cache.
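The embed-once behavior can be sketched as a small cache helper (a hypothetical reimplementation — the real embedder.ts uses LangChain's OpenAI embeddings; `loadOrBuildVectors` and the injected `embed` function are illustrative names):

```typescript
import * as fs from "node:fs";

type EmbedFn = (texts: string[]) => Promise<number[][]>;

// Embed-once cache: if the vector file already exists, reuse it;
// otherwise call the (injected) embedding function and persist the result.
async function loadOrBuildVectors(
  cachePath: string,
  texts: string[],
  embed: EmbedFn
): Promise<number[][]> {
  if (fs.existsSync(cachePath)) {
    // Subsequent runs: skip embedding entirely.
    return JSON.parse(fs.readFileSync(cachePath, "utf8"));
  }
  // First run: embed and save for next time.
  const vectors = await embed(texts);
  fs.writeFileSync(cachePath, JSON.stringify(vectors));
  return vectors;
}
```

In the real project the injected function would be backed by OpenAI embeddings and the vectors would feed the in-memory vector store.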

📂 Project layout

.
├── scripts/
│   └── web-scraper/
│       ├── get-url-app-future.py               # fetch URLs from app
│       └── intercom_help_articles-structure.py # structure scraper
├── data/
│   ├── articles.json        # raw FAQ docs
│   └── pages_vectors.json   # persisted vectors (generated)
├── src/
│   ├── embedder.ts          # vector store builder/cache
│   ├── qa.ts                # chain factory
│   ├── cli.ts               # interactive Q&A
│   └── eval.ts              # evaluation script
├── .env.example
├── package.json
└── README.md                # this file

🧪 Evaluation

yarn test # or npm run eval

Runs three default test questions (edit TESTS in eval.ts). Outputs data/eval_results.tsv with columns:

question | sim | overlap | pass | answer

A row passes when similarity ≥ 0.82 and at least 50% of the answer's words originate from the retrieved context.
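The pass rule can be sketched as two pure checks (a hypothetical reimplementation for illustration — the actual tokenization and scoring live in eval.ts):

```typescript
// Cosine similarity between two embedding vectors (the "sim" column).
function cosineSim(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Fraction of the answer's words that also appear in the retrieved
// context (the "overlap" column).
function contextOverlap(answer: string, context: string): number {
  const ctx = new Set(context.toLowerCase().split(/\W+/).filter(Boolean));
  const words = answer.toLowerCase().split(/\W+/).filter(Boolean);
  if (words.length === 0) return 0;
  return words.filter((w) => ctx.has(w)).length / words.length;
}

// A row passes when both thresholds are met.
function passes(sim: number, overlap: number): boolean {
  return sim >= 0.82 && overlap >= 0.5;
}
```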

🛡️ Hallucination mitigation

  1. System prompt: “Answer only from provided context…”
  2. Retrieval chunks limited to k = 4, temperature 0.
  3. Post-answer checks in eval.ts: context overlap, with the OpenAI Moderation API easy to plug in.
  4. If either guard fails, respond with a safe fallback.
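The fallback step above could look roughly like this (`answerOrFallback` and the fallback text are hypothetical names, not the project's actual code):

```typescript
// Safe canned response used when a post-answer guard fails.
const FALLBACK = "Sorry, I couldn't find that in the Pineapple Builder docs.";

// Every guard must pass for the raw model answer to be returned;
// otherwise the safe fallback is used.
function answerOrFallback(
  answer: string,
  checks: Array<(answer: string) => boolean>
): string {
  return checks.every((check) => check(answer)) ? answer : FALLBACK;
}
```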

🏗 Extending

  • UI — swap cli.ts for a small Next.js frontend.
  • Scale — replace MemoryVectorStore with Pinecone.
  • Docs — add more TESTS and bump thresholds as needed.

Made with ❤️ in ~2 hours. Ping me if you hit any bumps!
