Thanks to visit codestin.com
Credit goes to github.com

Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
309 changes: 309 additions & 0 deletions 08-EMBEDDING/05-OllamaEmbeddings.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,309 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Ollama Embeddings With Langchain\n",
"\n",
"- Author: [Gwangwon Jung](https://github.com/pupba)\n",
"- Design: []()\n",
"- This is a part of [LangChain Open Tutorial](https://github.com/LangChain-OpenTutorial/LangChain-OpenTutorial)\n",
"\n",
"[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain-academy/blob/main/module-4/sub-graph.ipynb) [![Open in LangChain Academy](https://cdn.prod.website-files.com/65b8cd72835ceeacd4449a53/66e9eba12c7b7688aa3dbb5e_LCA-badge-green.svg)](https://academy.langchain.com/courses/take/intro-to-langgraph/lessons/58239937-lesson-2-sub-graphs)\n",
"\n",
"## Overview\n",
"\n",
"This tutorial covers how to perform `Text Embedding` using `Ollama` and `Langchain`.\n",
"\n",
"`Ollama` is an open-source project that allows you to easily serve models locally.\n",
"\n",
"In this tutorial, we will create a simple example to measure the similarity between `Documents` and an input `Query` using `Ollama` and `Langchain`.\n",
"\n",
"![example](./assets/example-flow-ollama-embedding-cal-similarity.png)\n",
"\n",
"### Table of Contents\n",
"\n",
"- [Overview](#overview)\n",
"- [Environement Setup](#environment-setup)\n",
"- [Ollama Install and Model Serving](#ollama-install-and-model-serving)\n",
"- [Identify Supported Embedding Models and Serving Model](#identify-supported-embedding-models-and-serving-model)\n",
"- [Model Load and Embedding](#model-load-and-embedding)\n",
"- [The similarity calculation results](#the-similarity-calculation-results)\n",
"\n",
"### References\n",
"\n",
"- [Ollama](https://ollama.com/)\n",
"- [Cosine Similarity](https://en.wikipedia.org/wiki/Cosine_similarity)\n",
"----"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Environment Setup\n",
"\n",
"Set up the environment. You may refer to [Environment Setup](https://wikidocs.net/257836) for more details.\n",
"\n",
"**[Note]**\n",
"- `langchain-opentutorial` is a package that provides a set of easy-to-use environment setup, useful functions and utilities for tutorials. \n",
"- You can checkout the [`langchain-opentutorial`](https://github.com/LangChain-OpenTutorial/langchain-opentutorial-pypi) for more details."
]
},
{
"cell_type": "code",
"execution_count": 66,
"metadata": {},
"outputs": [],
"source": [
"%%capture --no-stderr\n",
"%pip install langchain-opentutorial"
]
},
{
"cell_type": "code",
"execution_count": 67,
"metadata": {},
"outputs": [],
"source": [
"# Install required packages\n",
"from langchain_opentutorial import package\n",
"\n",
"package.install(\n",
" [\n",
" \"langchain_community\",\n",
" \"langchain-ollama\",\n",
" \"scikit-learn\",\n",
" ],\n",
" verbose=False,\n",
" upgrade=False,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Ollama Install and Model Serving\n",
"\n",
"Ollama is an open-source project that makes it easy to run large language models(LLM) in a local environment. This tool allows users to download and run various LLMs with simple commands, enabling developers to experiment with and use AI models directly on their computers. Ollama is a tool with a user-friendly interface and fast performance, making AI development and experimentation more accessible and efficient.\n",
"\n",
"- [Official Website/Installation](https://ollama.com/)\n",
"\n",
"### Ollama User Guide\n",
"**1. Installation Ollama** <br>\n",
" ![guide1](./assets/guide1.png)\n",
"\n",
"**2. Verify Ollama Installation** <br>\n",
" ![guide2](./assets/guide2.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Identify Supported Embedding Models and Serving Model\n",
"\n",
"👇 You can find the model in the hyperlink below.\n",
"\n",
"- [Ollama Models](https://ollama.com/search)\n",
"\n",
"### Ollama Model Pull Guide\n",
"\n",
"**1. Search Models** <br>\n",
"\n",
"![guide3](./assets/guide3.png)\n",
"\n",
"**2. Pull a Model** <br>\n",
"\n",
"![guide4](./assets/guide4.png)\n",
"\n",
"**3. Verify the Model** <br>\n",
"\n",
"![guide5](./assets/guide5.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Model Load and Embedding\n",
"\n",
"Now that you have downloaded the model, let's load the model you downloaded and proceed with the embedding.\n",
"\n",
"First, define `Query` and `Documents`"
]
},
{
"cell_type": "code",
"execution_count": 68,
"metadata": {},
"outputs": [],
"source": [
"# Query\n",
"q = \"Please tell me more about LangChain.\"\n",
"\n",
"# Documents for Text Embedding\n",
"docs = [\n",
" \"Hi, nice to meet you.\",\n",
" \"LangChain simplifies the process of building applications with large language models.\",\n",
" \"The LangChain English tutorial is structured based on LangChain's official documentation, cookbook, and various practical examples to help users utilize LangChain more easily and effectively.\",\n",
" \"LangChain simplifies the process of building applications with large-scale language models.\",\n",
" \"Retrieval-Augmented Generation (RAG) is an effective technique for improving AI responses.\",\n",
"]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, let's load the embedding model downloaded with `Ollama` using `Langchain`.\n",
"\n",
"\n",
"The `OllamaEmbeddings` class in `langchain_community/embeddings.py` will be removed in `langchain-community` version 1.0.0."
]
},
{
"cell_type": "code",
"execution_count": 69,
"metadata": {},
"outputs": [],
"source": [
"# Load Embedding Model : Legacy\n",
"from langchain_community.embeddings import OllamaEmbeddings\n",
"\n",
"# Serving and load the desired embedding model.\n",
"ollama_embeddings = OllamaEmbeddings(\n",
" model=\"nomic-embed-text\", # model=<model-name>\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So, in this tutorial,we used the `OllamaEmbeddings` class from `langchain-ollama`."
]
},
{
"cell_type": "code",
"execution_count": 70,
"metadata": {},
"outputs": [],
"source": [
"# Load Embedding Model : Latest\n",
"from langchain_ollama import OllamaEmbeddings\n",
"\n",
"# Serving and load the desired embedding model.\n",
"ollama_embeddings = OllamaEmbeddings(\n",
" model=\"nomic-embed-text\", # model=<model-name>\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's use the loaded model to embed the `Query` and `Documents`."
]
},
{
"cell_type": "code",
"execution_count": 71,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Embedding Dimension Output: 768\n"
]
}
],
"source": [
"# Embedding Query\n",
"embedded_query = ollama_embeddings.embed_query(q)\n",
"\n",
"# Embedding Documents\n",
"embedded_docs = ollama_embeddings.embed_documents(docs)\n",
"\n",
"print(f\"Embedding Dimension Output: {len(embedded_query)}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## The similarity calculation results\n",
"\n",
"Let's use the vector values of the query and documents obtained earlier to calculate the similarity.\n",
"\n",
"In this tutorial, we will use [cosine similarity](https://en.wikipedia.org/wiki/Cosine_similarity) to calculate the similarity between the `Query` and the `Documents`.\n",
"\n",
"Using the `Sklearn` library in Python, you can easily calculate **cosine similarity**."
]
},
{
"cell_type": "code",
"execution_count": 72,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[Query] Tell me about LangChain.\n",
"====================================\n",
"[0] similarity: 0.775 | The LangChain English tutorial is structured based on LangChain's official documentation, cookbook, and various practical examples to help users utilize LangChain more easily and effectively.\n",
"\n",
"[1] similarity: 0.748 | LangChain simplifies the process of building applications with large language models.\n",
"\n",
"[2] similarity: 0.745 | LangChain simplifies the process of building applications with large-scale language models.\n",
"\n",
"[3] similarity: 0.399 | Retrieval-Augmented Generation (RAG) is an effective technique for improving AI responses.\n",
"\n",
"[4] similarity: 0.398 | Hi, nice to meet you.\n",
"\n"
]
}
],
"source": [
"from sklearn.metrics.pairwise import cosine_similarity\n",
"\n",
"# Calculate Cosine Similarity\n",
"similarity = cosine_similarity([embedded_query], embedded_docs)\n",
"\n",
"# Sorting by Similarity in descending order\n",
"sorted_idx = similarity.argsort()[0][::-1]\n",
"\n",
"# Output Result\n",
"print(\"[Query] Tell me about LangChain.\\n====================================\")\n",
"for i, idx in enumerate(sorted_idx):\n",
" print(f\"[{i}] similarity: {similarity[0][idx]:.3f} | {docs[idx]}\")\n",
" print()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "test",
"language": "python",
"name": "langchain"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.11"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added 08-EMBEDDING/assets/guide1.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added 08-EMBEDDING/assets/guide2.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added 08-EMBEDDING/assets/guide3.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added 08-EMBEDDING/assets/guide4.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added 08-EMBEDDING/assets/guide5.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.