-
Notifications
You must be signed in to change notification settings - Fork 282
[E-2] 16-Evaluations / 06-LangSmith-Embedding-Distance-Evaluation #384
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
Show all changes
4 commits
Select commit
Hold shift + click to select a range
30ff8ec
feat: First version of Embedding-Distance-Evaluation tutorial
choincnp 71e8357
Merge branch 'LangChain-OpenTutorial:main' into tutorial
choincnp 09d4c8c
fix: changed UpstageEmbeddings to OpenAIEmbedding to prevent kernel c…
choincnp d7d6a0f
chore: delete original uppercase file
choincnp File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
347 changes: 347 additions & 0 deletions
347
16-Evaluations/06-LangSmith-Embedding-Distance-Evaluation.ipynb
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,347 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"id": "635d8ebb", | ||
"metadata": {}, | ||
"source": [ | ||
"# Embedding-based Evaluator(embedding_distance)\n", | ||
"\n", | ||
"- Author: [Youngjun Cho](https://github.com/choincnp)\n", | ||
"- Design: \n", | ||
"- Peer Review: \n", | ||
"- This is a part of [LangChain Open Tutorial](https://github.com/LangChain-OpenTutorial/LangChain-OpenTutorial)\n", | ||
"\n", | ||
"[](https://colab.research.google.com/github/LangChain-OpenTutorial/LangChain-OpenTutorial/blob/main/99-TEMPLATE/00-BASE-TEMPLATE-EXAMPLE.ipynb) [](https://github.com/LangChain-OpenTutorial/LangChain-OpenTutorial/blob/main/99-TEMPLATE/00-BASE-TEMPLATE-EXAMPLE.ipynb)\n", | ||
"\n", | ||
"## Overview\n", | ||
"\n", | ||
"The **Embedding-based Evaluator** (`embedding_distance`) part is designed to evaluate question-answering systems using various **embedding models** and **distance metrics** .\n", | ||
"\n", | ||
"### Table of Contents\n", | ||
"\n", | ||
"- [Overview](#overview)\n", | ||
"- [Environment Setup](#environment-setup)\n", | ||
"- [Defining functions for rag performance testing](#defining-functions-for-rag-performance-testing)\n", | ||
"- [Embedding distance based evaluator](#embedding-distance-based-evaluator)\n", | ||
"\n", | ||
"### References\n", | ||
"\n", | ||
"- [LangChain OpenAIEmbeddings](https://python.langchain.com/api_reference/openai/embeddings/langchain_openai.embeddings.base.OpenAIEmbeddings.html)\n", | ||
"- [LangChain StringEvaluator](https://python.langchain.com/api_reference/langchain/evaluation/langchain.evaluation.schema.StringEvaluator.html)\n", | ||
"\n", | ||
"----" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "c6c7aba4", | ||
"metadata": {}, | ||
"source": [ | ||
"## Environment Setup\n", | ||
"\n", | ||
"Setting up your environment is the first step. See the [Environment Setup](https://wikidocs.net/257836) guide for more details.\n", | ||
"\n", | ||
"\n", | ||
"**[Note]**\n", | ||
"\n", | ||
"The langchain-opentutorial is a package of easy-to-use environment setup guidance, useful functions and utilities for tutorials.\n", | ||
"Check out the [`langchain-opentutorial`](https://github.com/LangChain-OpenTutorial/langchain-opentutorial-pypi) for more details." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 1, | ||
"id": "21943adb", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"%%capture --no-stderr\n", | ||
"%pip install langchain-opentutorial" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 2, | ||
"id": "f25ec196", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Install required packages\n", | ||
"from langchain_opentutorial import package\n", | ||
"\n", | ||
"package.install(\n", | ||
" [\n", | ||
" \"langsmith\",\n", | ||
" \"langchain\",\n", | ||
" \"langchain_core\",\n", | ||
" \"langchain_community\",\n", | ||
" \"langchain_openai\",\n", | ||
" \"langchain_upstage\",\n", | ||
" \"PyMuPDF\"\n", | ||
" ],\n", | ||
" verbose=False,\n", | ||
" upgrade=False,\n", | ||
")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "690a9ae0", | ||
"metadata": {}, | ||
"source": [ | ||
"You can set API keys in a `.env` file or set them manually.\n", | ||
"\n", | ||
"[Note] If you’re not using the `.env` file, no worries! Just enter the keys directly in the cell below, and you’re good to go." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 3, | ||
"id": "327c2c7c", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from dotenv import load_dotenv\n", | ||
"from langchain_opentutorial import set_env\n", | ||
"\n", | ||
"# Attempt to load environment variables from a .env file; if unsuccessful, set them manually.\n", | ||
"if not load_dotenv():\n", | ||
" set_env(\n", | ||
" {\n", | ||
" \"OPENAI_API_KEY\": \"\",\n", | ||
" \"LANGCHAIN_API_KEY\": \"\",\n", | ||
" \"UPSTAGE_API_KEY\": \"\",\n", | ||
" \"LANGCHAIN_TRACING_V2\": \"true\",\n", | ||
" \"LANGCHAIN_ENDPOINT\": \"https://api.smith.langchain.com\",\n", | ||
" \"LANGCHAIN_PROJECT\": \"06-LangSmith-Embedding-Distance-Evaluation\", # set the project name same as the title\n", | ||
" }\n", | ||
" )" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "aa00c3f4", | ||
"metadata": {}, | ||
"source": [ | ||
"## Defining Functions for RAG Performance Testing\n", | ||
"\n", | ||
"We will create a RAG system for testing purposes." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 4, | ||
"id": "69cb77da", | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"data": { | ||
"text/plain": [ | ||
"'Agents differ from standalone language models in that agents extend the capabilities of language models by leveraging tools to access real-time information, suggest real-world actions, and plan and execute complex tasks autonomously. While standalone language models are limited to the knowledge available in their training data, agents can enhance their knowledge through connections to external resources and tools, allowing them to perform more dynamic and complex functions.'" | ||
] | ||
}, | ||
"execution_count": 4, | ||
"metadata": {}, | ||
"output_type": "execute_result" | ||
} | ||
], | ||
"source": [ | ||
"from myrag import PDFRAG\n", | ||
"from langchain_openai import ChatOpenAI\n", | ||
"\n", | ||
"# Create a PDFRAG object\n", | ||
"rag = PDFRAG(\n", | ||
" \"data/Newwhitepaper_Agents2.pdf\",\n", | ||
" ChatOpenAI(model=\"gpt-4o-mini\", temperature=0),\n", | ||
")\n", | ||
"\n", | ||
"# Create a retriever\n", | ||
"retriever = rag.create_retriever()\n", | ||
"\n", | ||
"# Create a chain\n", | ||
"chain = rag.create_chain(retriever)\n", | ||
"\n", | ||
"# Generate an answer for a question\n", | ||
"chain.invoke(\"How do agents differ from standalone language models?\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "29a450f7", | ||
"metadata": {}, | ||
"source": [ | ||
"Create a function named `ask_question` to handle answering questions. The function takes a dictionary `inputs` as input and returns a dictionary `answer` as output." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 5, | ||
"id": "1e46025f", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Create a function to answer questions\n", | ||
"def ask_question(inputs: dict):\n", | ||
" return {\"answer\": chain.invoke(inputs[\"question\"])}" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "3e187980", | ||
"metadata": {}, | ||
"source": [ | ||
"## Embedding Distance-based Evaluator" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "ddc54d01", | ||
"metadata": {}, | ||
"source": [ | ||
"We will build a system for evaluating sentence similarity using various embedding models and distance metrics. \n", | ||
"\n", | ||
"The code below defines configurations for each model and metric using the `LangChainStringEvaluator`.\n", | ||
"\n", | ||
"[ **Note** ] \n", | ||
"For LangChainStringEvaluator, `OpenAIEmbeddings` is set as the default, but it can be changed." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 6, | ||
"id": "2ba16e5b", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from langsmith.evaluation import LangChainStringEvaluator\n", | ||
"from langchain_upstage import UpstageEmbeddings\n", | ||
"from langchain_openai import OpenAIEmbeddings\n", | ||
"\n", | ||
"# Create an embedding model evaluator\n", | ||
"openai_embedding_cosine_evaluator = LangChainStringEvaluator(\n", | ||
" \"embedding_distance\",\n", | ||
" config={\n", | ||
" # OpenAIEmbeddings is set as the default, but can be changed\n", | ||
" \"embeddings\": OpenAIEmbeddings(model=\"text-embedding-3-small\"),\n", | ||
" \"distance_metric\": \"cosine\", # \"cosine\", \"euclidean\", \"chebyshev\", \"hamming\", and \"manhattan\"\n", | ||
" },\n", | ||
")\n", | ||
"\n", | ||
"upstage_embedding_evaluator = LangChainStringEvaluator(\n", | ||
" \"embedding_distance\",\n", | ||
" config={\n", | ||
" # OpenAIEmbeddings is set as the default, but can be changed\n", | ||
" \"embeddings\": UpstageEmbeddings(model=\"embedding-query\"),\n", | ||
" \"distance_metric\": \"euclidean\", # \"cosine\", \"euclidean\", \"chebyshev\", \"hamming\", and \"manhattan\"\n", | ||
" },\n", | ||
")\n", | ||
"\n", | ||
"openai_embedding_evaluator = LangChainStringEvaluator(\n", | ||
" \"embedding_distance\",\n", | ||
" config={\n", | ||
" # OpenAIEmbeddings is set as the default, but can be changed\n", | ||
" \"embeddings\": OpenAIEmbeddings(model=\"text-embedding-3-small\"),\n", | ||
" \"distance_metric\": \"euclidean\", # \"cosine\", \"euclidean\", \"chebyshev\", \"hamming\", and \"manhattan\"\n", | ||
" },\n", | ||
")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "10e94e48", | ||
"metadata": {}, | ||
"source": [ | ||
"When multiple embedding models are used for **one metric** , the results are averaged.\n", | ||
"\n", | ||
"Example:\n", | ||
"- `cosine` : OpenAI\n", | ||
"- `euclidean` : OpenAI, Upstage\n", | ||
"\n", | ||
"For `euclidean` , the average value across the models is calculated." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 7, | ||
"id": "de80a4f6", | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"View the evaluation results for experiment: 'EMBEDDING-EVAL-e7657248' at:\n", | ||
"https://smith.langchain.com/o/9089d1d3-e786-4000-8468-66153f05444b/datasets/9b4ca107-33fe-4c71-bb7f-488272d895a3/compare?selectedSessions=43f0123f-de7a-4434-ab59-b4ff06134982\n", | ||
"\n", | ||
"\n" | ||
] | ||
}, | ||
{ | ||
"data": { | ||
"application/vnd.jupyter.widget-view+json": { | ||
"model_id": "a876d0a50ece466baec5a3d834a63a70", | ||
"version_major": 2, | ||
"version_minor": 0 | ||
}, | ||
"text/plain": [ | ||
"0it [00:00, ?it/s]" | ||
] | ||
}, | ||
"metadata": {}, | ||
"output_type": "display_data" | ||
} | ||
], | ||
"source": [ | ||
"from langsmith.evaluation import evaluate\n", | ||
"\n", | ||
"dataset_name = \"RAG_EVAL_DATASET\"\n", | ||
"\n", | ||
"# Run evaluation\n", | ||
"experiment_results = evaluate(\n", | ||
" ask_question,\n", | ||
" data=dataset_name,\n", | ||
" evaluators=[\n", | ||
" openai_embedding_cosine_evaluator,\n", | ||
" upstage_embedding_evaluator,\n", | ||
" openai_embedding_evaluator,\n", | ||
" ],\n", | ||
" experiment_prefix=\"EMBEDDING-EVAL\",\n", | ||
" # Specify experiment metadata\n", | ||
" metadata={\n", | ||
" \"variant\": \"Evaluation using embedding_distance\",\n", | ||
" },\n", | ||
")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "c3246905", | ||
"metadata": {}, | ||
"source": [ | ||
"" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "langchain-kr-lwwSZlnu-py3.11", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.11.11" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 5 | ||
} |
Binary file added
BIN
+103 KB
16-Evaluations/assets/06-langsmith-embedding-distance-evaluation-01.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pymupdf
패키지를 install 하는 코드를 추가해야 될것 같습니다.PyMuPDF
package not found, please install it withpip install pymupdf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
친절한 리뷰 감사합니다! 추가해서 넣도록 하겠습니다!!