diff --git a/16-Evaluations/06-LangSmith-Embedding-Distance-Evaluation.ipynb b/16-Evaluations/06-LangSmith-Embedding-Distance-Evaluation.ipynb
new file mode 100644
index 000000000..048c1e3b1
--- /dev/null
+++ b/16-Evaluations/06-LangSmith-Embedding-Distance-Evaluation.ipynb
@@ -0,0 +1,347 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "635d8ebb",
+   "metadata": {},
+   "source": [
+    "# Embedding-based Evaluator(embedding_distance)\n",
+    "\n",
+    "- Author: [Youngjun Cho](https://github.com/choincnp)\n",
+    "- Design: \n",
+    "- Peer Review: \n",
+    "- This is a part of [LangChain Open Tutorial](https://github.com/LangChain-OpenTutorial/LangChain-OpenTutorial)\n",
+    "\n",
+    "[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/LangChain-OpenTutorial/LangChain-OpenTutorial/blob/main/16-Evaluations/06-LangSmith-Embedding-Distance-Evaluation.ipynb) [![Open in GitHub](https://img.shields.io/badge/Open%20in%20GitHub-181717?style=flat-square&logo=github&logoColor=white)](https://github.com/LangChain-OpenTutorial/LangChain-OpenTutorial/blob/main/16-Evaluations/06-LangSmith-Embedding-Distance-Evaluation.ipynb)\n",
+    "\n",
+    "## Overview\n",
+    "\n",
+    "The **Embedding-based Evaluator** (`embedding_distance`) scores a question-answering system by embedding the generated answer and the reference answer and measuring the distance between them, using various **embedding models** and **distance metrics** .\n",
+    "\n",
+    "### Table of Contents\n",
+    "\n",
+    "- [Overview](#overview)\n",
+    "- [Environment Setup](#environment-setup)\n",
+    "- [Defining Functions for RAG Performance Testing](#defining-functions-for-rag-performance-testing)\n",
+    "- [Embedding Distance-based Evaluator](#embedding-distance-based-evaluator)\n",
+    "\n",
+    "### References\n",
+    "\n",
+    "- [LangChain OpenAIEmbeddings](https://python.langchain.com/api_reference/openai/embeddings/langchain_openai.embeddings.base.OpenAIEmbeddings.html)\n",
+    "- [LangChain StringEvaluator](https://python.langchain.com/api_reference/langchain/evaluation/langchain.evaluation.schema.StringEvaluator.html)\n",
+    "\n",
+    "----"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c6c7aba4",
+   "metadata": {},
+   "source": [
+    "## Environment Setup\n",
+    "\n",
+    "Setting up your environment is the first step. See the [Environment Setup](https://wikidocs.net/257836) guide for more details.\n",
+    "\n",
+    "\n",
+    "**[Note]**\n",
+    "\n",
+    "`langchain-opentutorial` is a package that provides easy-to-use environment setup, useful functions, and utilities for these tutorials.\n",
+    "Check out the [`langchain-opentutorial`](https://github.com/LangChain-OpenTutorial/langchain-opentutorial-pypi) for more details."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "21943adb",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%%capture --no-stderr\n",
+    "%pip install langchain-opentutorial"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "f25ec196",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Install required packages\n",
+    "from langchain_opentutorial import package\n",
+    "\n",
+    "package.install(\n",
+    "    [\n",
+    "        \"langsmith\",\n",
+    "        \"langchain\",\n",
+    "        \"langchain_core\",\n",
+    "        \"langchain_community\",\n",
+    "        \"langchain_openai\",\n",
+    "        \"langchain_upstage\",\n",
+    "        \"PyMuPDF\"\n",
+    "    ],\n",
+    "    verbose=False,\n",
+    "    upgrade=False,\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "690a9ae0",
+   "metadata": {},
+   "source": [
+    "You can set API keys in a `.env` file or set them manually.\n",
+    "\n",
+    "[Note] If you’re not using the `.env` file, no worries! \n",
+    "Just enter the keys directly in the cell below, and you’re good to go."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "327c2c7c",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from dotenv import load_dotenv\n",
+    "from langchain_opentutorial import set_env\n",
+    "\n",
+    "# Attempt to load environment variables from a .env file; if unsuccessful, set them manually.\n",
+    "if not load_dotenv():\n",
+    "    set_env(\n",
+    "        {\n",
+    "            \"OPENAI_API_KEY\": \"\",\n",
+    "            \"LANGCHAIN_API_KEY\": \"\",\n",
+    "            \"UPSTAGE_API_KEY\": \"\",\n",
+    "            \"LANGCHAIN_TRACING_V2\": \"true\",\n",
+    "            \"LANGCHAIN_ENDPOINT\": \"https://api.smith.langchain.com\",\n",
+    "            \"LANGCHAIN_PROJECT\": \"06-LangSmith-Embedding-Distance-Evaluation\", # set the project name same as the title\n",
+    "        }\n",
+    "    )"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "aa00c3f4",
+   "metadata": {},
+   "source": [
+    "## Defining Functions for RAG Performance Testing\n",
+    "\n",
+    "We will create a RAG system for testing purposes."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "69cb77da",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'Agents differ from standalone language models in that agents extend the capabilities of language models by leveraging tools to access real-time information, suggest real-world actions, and plan and execute complex tasks autonomously. While standalone language models are limited to the knowledge available in their training data, agents can enhance their knowledge through connections to external resources and tools, allowing them to perform more dynamic and complex functions.'"
+      ]
+     },
+     "execution_count": 4,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "from myrag import PDFRAG\n",
+    "from langchain_openai import ChatOpenAI\n",
+    "\n",
+    "# Create a PDFRAG object\n",
+    "rag = PDFRAG(\n",
+    "    \"data/Newwhitepaper_Agents2.pdf\",\n",
+    "    ChatOpenAI(model=\"gpt-4o-mini\", temperature=0),\n",
+    ")\n",
+    "\n",
+    "# Create a retriever\n",
+    "retriever = rag.create_retriever()\n",
+    "\n",
+    "# Create a chain\n",
+    "chain = rag.create_chain(retriever)\n",
+    "\n",
+    "# Generate an answer for a question\n",
+    "chain.invoke(\"How do agents differ from standalone language models?\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "29a450f7",
+   "metadata": {},
+   "source": [
+    "Create a function named `ask_question` to handle answering questions. The function takes a dictionary `inputs` as input and returns a dictionary containing the generated `answer`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "id": "1e46025f",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Create a function to answer questions\n",
+    "def ask_question(inputs: dict):\n",
+    "    return {\"answer\": chain.invoke(inputs[\"question\"])}"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3e187980",
+   "metadata": {},
+   "source": [
+    "## Embedding Distance-based Evaluator"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ddc54d01",
+   "metadata": {},
+   "source": [
+    "We will build a system for evaluating sentence similarity using various embedding models and distance metrics. \n",
+    "\n",
+    "The code below defines configurations for each model and metric using the `LangChainStringEvaluator`.\n",
+    "\n",
+    "[ **Note** ] \n",
+    "For `LangChainStringEvaluator`, `OpenAIEmbeddings` is set as the default embedding model, but it can be changed."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "id": "2ba16e5b",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langsmith.evaluation import LangChainStringEvaluator\n",
+    "from langchain_upstage import UpstageEmbeddings\n",
+    "from langchain_openai import OpenAIEmbeddings\n",
+    "\n",
+    "# Create an embedding model evaluator\n",
+    "openai_embedding_cosine_evaluator = LangChainStringEvaluator(\n",
+    "    \"embedding_distance\",\n",
+    "    config={\n",
+    "        # OpenAIEmbeddings is set as the default, but can be changed\n",
+    "        \"embeddings\": OpenAIEmbeddings(model=\"text-embedding-3-small\"),\n",
+    "        \"distance_metric\": \"cosine\", # \"cosine\", \"euclidean\", \"chebyshev\", \"hamming\", and \"manhattan\"\n",
+    "    },\n",
+    ")\n",
+    "\n",
+    "upstage_embedding_evaluator = LangChainStringEvaluator(\n",
+    "    \"embedding_distance\",\n",
+    "    config={\n",
+    "        # OpenAIEmbeddings is set as the default, but can be changed\n",
+    "        \"embeddings\": UpstageEmbeddings(model=\"embedding-query\"),\n",
+    "        \"distance_metric\": \"euclidean\", # \"cosine\", \"euclidean\", \"chebyshev\", \"hamming\", and \"manhattan\"\n",
+    "    },\n",
+    ")\n",
+    "\n",
+    "openai_embedding_evaluator = LangChainStringEvaluator(\n",
+    "    \"embedding_distance\",\n",
+    "    config={\n",
+    "        # OpenAIEmbeddings is set as the default, but can be changed\n",
+    "        \"embeddings\": OpenAIEmbeddings(model=\"text-embedding-3-small\"),\n",
+    "        \"distance_metric\": \"euclidean\", # \"cosine\", \"euclidean\", \"chebyshev\", \"hamming\", and \"manhattan\"\n",
+    "    },\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "10e94e48",
+   "metadata": {},
+   "source": [
+    "When multiple embedding models are used for **one metric** , the results are averaged.\n",
+    "\n",
+    "Example:\n",
+    "- `cosine` : OpenAI\n",
+    "- `euclidean` : OpenAI, Upstage\n",
+    "\n",
+    "For `euclidean` , the average value across the models is calculated."
+   ]
+  },
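+  {
+   "cell_type": "markdown",
+   "id": "b2f6c9d3",
+   "metadata": {},
+   "source": [
+    "Each evaluator above embeds the model's answer and the reference answer, then reports the distance between the two vectors, so **lower scores mean more similar answers** .\n",
+    "\n",
+    "The cell below is a minimal sanity-check sketch of the `cosine` case, computing the distance as `1 - cosine similarity` . The two example sentences are hypothetical stand-ins for a generated answer and a reference answer; this cell is not part of the evaluation pipeline itself."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c7d1e8a4",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "from langchain_openai import OpenAIEmbeddings\n",
+    "\n",
+    "# Hypothetical example strings standing in for a generated answer and a reference answer\n",
+    "generated_answer = \"Agents extend language models with tools, planning, and real-time information.\"\n",
+    "reference_answer = \"Agents differ from standalone language models by leveraging external tools.\"\n",
+    "\n",
+    "# Embed both strings with the same model used by the cosine evaluator above\n",
+    "embeddings = OpenAIEmbeddings(model=\"text-embedding-3-small\")\n",
+    "prediction, reference = map(np.array, embeddings.embed_documents([generated_answer, reference_answer]))\n",
+    "\n",
+    "# Cosine distance = 1 - cosine similarity; smaller values mean closer answers\n",
+    "cosine_distance = 1 - np.dot(prediction, reference) / (np.linalg.norm(prediction) * np.linalg.norm(reference))\n",
+    "print(cosine_distance)"
+   ]
+  },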
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "id": "de80a4f6",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "View the evaluation results for experiment: 'EMBEDDING-EVAL-e7657248' at:\n",
+      "https://smith.langchain.com/o/9089d1d3-e786-4000-8468-66153f05444b/datasets/9b4ca107-33fe-4c71-bb7f-488272d895a3/compare?selectedSessions=43f0123f-de7a-4434-ab59-b4ff06134982\n",
+      "\n",
+      "\n"
+     ]
+    },
+    {
+     "data": {
+      "application/vnd.jupyter.widget-view+json": {
+       "model_id": "a876d0a50ece466baec5a3d834a63a70",
+       "version_major": 2,
+       "version_minor": 0
+      },
+      "text/plain": [
+       "0it [00:00, ?it/s]"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "from langsmith.evaluation import evaluate\n",
+    "\n",
+    "dataset_name = \"RAG_EVAL_DATASET\"\n",
+    "\n",
+    "# Run evaluation\n",
+    "experiment_results = evaluate(\n",
+    "    ask_question,\n",
+    "    data=dataset_name,\n",
+    "    evaluators=[\n",
+    "        openai_embedding_cosine_evaluator,\n",
+    "        upstage_embedding_evaluator,\n",
+    "        openai_embedding_evaluator,\n",
+    "    ],\n",
+    "    experiment_prefix=\"EMBEDDING-EVAL\",\n",
+    "    # Specify experiment metadata\n",
+    "    metadata={\n",
+    "        \"variant\": \"Evaluation using embedding_distance\",\n",
+    "    },\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c3246905",
+   "metadata": {},
+   "source": [
+    "![](./assets/06-langsmith-embedding-distance-evaluation-01.png)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "langchain-kr-lwwSZlnu-py3.11",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.11"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/16-Evaluations/assets/06-langsmith-embedding-distance-evaluation-01.png b/16-Evaluations/assets/06-langsmith-embedding-distance-evaluation-01.png
new file mode 100644
index 000000000..3d754be68
Binary files /dev/null and b/16-Evaluations/assets/06-langsmith-embedding-distance-evaluation-01.png differ