- Create the testing dataset: `python croptalk/create_test_dataset.py`
- Run the evaluation on the dataset: `python evaluate_openai_functions.py`
- Check the LangSmith datasets for the results.
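If you prefer to inspect the dataset programmatically rather than through the LangSmith UI, the `langsmith` client can list its examples. A minimal sketch; the dataset name is a placeholder — use whatever name `create_test_dataset.py` actually registers:

```python
from langsmith import Client

client = Client()  # picks up LANGCHAIN_API_KEY from the environment

# "croptalk-test-dataset" is a placeholder dataset name.
for example in client.list_examples(dataset_name="croptalk-test-dataset"):
    print(example.inputs, "->", example.outputs)
```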
This repo is an implementation of a locally hosted chatbot specifically focused on question answering over the LangChain documentation. Built with LangChain, FastAPI, and Next.js.
Deployed version: chat.langchain.com
The app leverages LangChain's streaming support and async API to update the page in real time for multiple users.
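As a rough illustration of that streaming/async combination (not the app's actual request handler), a LangChain chat model can be streamed token by token with `astream`, assuming the `langchain-openai` package:

```python
import asyncio

from langchain_openai import ChatOpenAI

async def main() -> None:
    llm = ChatOpenAI(model="gpt-3.5-turbo", streaming=True)
    # astream() yields message chunks as the model produces them,
    # which is what lets the page update in real time.
    async for chunk in llm.astream("What is a vectorstore?"):
        print(chunk.content, end="", flush=True)

asyncio.run(main())
```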
- Update the lock file: `poetry lock`
- Install backend dependencies: `poetry install`
- Set the environment variables that configure the application:
  ```bash
  export OPENAI_API_KEY=
  export WEAVIATE_URL=
  export WEAVIATE_API_KEY=
  export RECORD_MANAGER_DB_URL=

  # for tracing
  export LANGCHAIN_TRACING_V2=true
  export LANGCHAIN_ENDPOINT="https://api.smith.langchain.com"
  export LANGCHAIN_API_KEY=
  export LANGCHAIN_PROJECT=
  ```
- Run `python ingest.py` to ingest the LangChain docs data into the Weaviate vectorstore (only needs to be done once).
  - You can use other Document Loaders to load your own data into the vectorstore (see the sketch after this list).
- Start the Python backend with `poetry run make start`.
- Install frontend dependencies by running `cd chat-langchain`, then `yarn`.
- Run the frontend with `yarn dev`.
- Open localhost:3000 in your browser.
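For the "load your own data" step above, any LangChain Document Loader can produce documents for the vectorstore. A minimal sketch, assuming the `langchain-community` package is installed; the URL is a placeholder for your own source:

```python
from langchain_community.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Placeholder URL: point the loader at your own data instead.
raw_docs = WebBaseLoader("https://example.com/my-docs").load()

splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
docs = splitter.split_documents(raw_docs)
# `docs` can then be indexed into Weaviate the same way ingest.py
# indexes the LangChain documentation.
```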
- Make sure you have all 3 secret files (`.env.secret`, `.env.share`, `dsmain_ssh_ec3`) in the `secrets` folder.
- Launch the app: `docker compose -f ./docker-compose-local.yml up -d --build`
- The app is now available at http://localhost:3000/
- To run tests, open a terminal:
  - Go into the running backend container: `docker exec -ti $(docker ps -qf "name=chat-langchain-backend") /bin/bash`
  - Run the tests: `python -m pytest tests`
  - Run the evaluation script, whose options/arguments are:

    ```
    root@e6123c0f9b6b:~# python _scripts/evaluate_doc_retrieval.py --help
    usage: evaluate_doc_retrieval.py [-h] [--use-model-llm] eval_path

    positional arguments:
      eval_path        CSV file path that contains evaluation use cases

    options:
      -h, --help       show this help message and exit
      --use-model-llm  Option which, when specified, tells the evaluation to use
                       model_llm (i.e. use model_openai_functions when this
                       option is not specified)
    ```

    You can see an example of the script's expected input in `./_scripts/evaluate_doc_retrieval.csv`. You can then use `pandas.read_csv(<path>)` to load the generated evaluation report, whose path is reported on the last line of the script's output (see the example below).

    ```
    root@e6123c0f9b6b:~# python _scripts/evaluate_doc_retrieval.py --use-model-llm ./_scripts/evaluate_doc_retrieval.csv
    INFO:root:Evaluating croptalk's document retrieval capacity, using config: Namespace(use_model_llm=True, eval_path='./_scripts/evaluate_doc_retrieval.csv')
    INFO:root:Number of use cases to evaluate: 2
    INFO:root:Creating output_df
    INFO:root:Loading model (...)
    INFO:root:Evaluation report/dataframe saved here: ./_scripts/evaluate_doc_retrieval__model_llm__2024-02-23T22:07:10.813830.csv
    ```
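For example, to load the report produced by the run above into a DataFrame:

```python
import pandas as pd

# Path taken from the last log line of the run above.
report = pd.read_csv(
    "./_scripts/evaluate_doc_retrieval__model_llm__2024-02-23T22:07:10.813830.csv"
)
print(report.head())
```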
- Launch the container: `docker-compose -f docker-compose-local.yml up -d --build`
- Attach to the container through the VSCode Remote Explorer.
- Open an `.ipynb` file and select a Python kernel. Install Python and Jupyter if needed (they are not installed in the container by default).

With the container running, execute the script (modify the dataset name if needed):

```bash
docker exec -it chat-langchain-backend-1 python _scripts/evaluate_overall_performance.py
```
There are two components: ingestion and question-answering.
Ingestion has the following steps:
- Pull HTML from the documentation site as well as the GitHub codebase.
- Load the HTML with LangChain's RecursiveURLLoader and SitemapLoader.
- Split documents with LangChain's RecursiveCharacterTextSplitter
- Create a vectorstore of embeddings, using LangChain's Weaviate vectorstore wrapper (with OpenAI's embeddings).
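Put together, the ingestion steps above look roughly like the sketch below. It is illustrative rather than a copy of `ingest.py` (it omits, for instance, the record-manager bookkeeping behind `RECORD_MANAGER_DB_URL`); import paths assume the `langchain-community` and `langchain-openai` packages, and the URLs are placeholders:

```python
import os

from langchain_community.document_loaders import RecursiveUrlLoader, SitemapLoader
from langchain_community.vectorstores import Weaviate
from langchain_openai import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter

# 1. Pull HTML from the documentation site (placeholder URLs).
raw_docs = RecursiveUrlLoader("https://python.langchain.com/docs/", max_depth=2).load()
raw_docs += SitemapLoader("https://python.langchain.com/sitemap.xml").load()

# 2. Split the documents into chunks.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
docs = splitter.split_documents(raw_docs)

# 3. Embed the chunks with OpenAI embeddings and store them in Weaviate.
Weaviate.from_documents(
    docs,
    OpenAIEmbeddings(),
    weaviate_url=os.environ["WEAVIATE_URL"],
    weaviate_api_key=os.environ["WEAVIATE_API_KEY"],
    by_text=False,
)
```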
Question-Answering has the following steps:
- Given the chat history and the new user input, use GPT-3.5 to rephrase the input as a standalone question.
- Given that standalone question, look up relevant documents from the vectorstore.
- Pass the standalone question and relevant documents to the model to generate and stream the final answer.
- Generate a trace URL for the current chat session, as well as the endpoint to collect feedback.
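A condensed sketch of those question-answering steps, assuming a retriever backed by the Weaviate vectorstore from the ingestion step; the prompts are illustrative, and the streaming, trace-URL, and feedback parts of the real app are omitted:

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.retrievers import BaseRetriever
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo")

condense_prompt = ChatPromptTemplate.from_template(
    "Given this conversation:\n{chat_history}\n\n"
    "Rephrase the follow-up question as a standalone question: {question}"
)
answer_prompt = ChatPromptTemplate.from_template(
    "Answer using only this context:\n{context}\n\nQuestion: {question}"
)

def answer(retriever: BaseRetriever, chat_history: str, question: str) -> str:
    # 1. Condense chat history + new input into a standalone question.
    standalone = (condense_prompt | llm | StrOutputParser()).invoke(
        {"chat_history": chat_history, "question": question}
    )
    # 2. Look up relevant documents for the standalone question.
    docs = retriever.invoke(standalone)
    context = "\n\n".join(doc.page_content for doc in docs)
    # 3. Generate the final answer (the real app streams this step).
    return (answer_prompt | llm | StrOutputParser()).invoke(
        {"context": context, "question": standalone}
    )
```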
Deploy the frontend Next.js app as a serverless Edge function on Vercel.
You'll need to populate the `NEXT_PUBLIC_API_BASE_URL` environment variable with the base URL you've deployed the backend under (no trailing slash!).