Merged
@@ -11,13 +11,13 @@
"- Peer Review : []()\n",
"- This is a part of [LangChain Open Tutorial](https://github.com/LangChain-OpenTutorial/LangChain-OpenTutorial)\n",
"\n",
"[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/LangChain-OpenTutorial/LangChain-OpenTutorial/blob/main/99-TEMPLATE/00-BASE-TEMPLATE-EXAMPLE.ipynb) [![Open in GitHub](https://img.shields.io/badge/Open%20in%20GitHub-181717?style=flat-square&logo=github&logoColor=white)](https://github.com/LangChain-OpenTutorial/LangChain-OpenTutorial/blob/main/99-TEMPLATE/00-BASE-TEMPLATE-EXAMPLE.ipynb)\n",
"[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/LangChain-OpenTutorial/LangChain-OpenTutorial/blob/main/10-Retriever/09-TimeWeightedVectorStoreRetriever.ipynb) [![Open in GitHub](https://img.shields.io/badge/Open%20in%20GitHub-181717?style=flat-square&logo=github&logoColor=white)](https://github.com/LangChain-OpenTutorial/LangChain-OpenTutorial/blob/main/10-Retriever/09-TimeWeightedVectorStoreRetriever.ipynb)\n",
"\n",
"## Overview\n",
"\n",
"`TimeWeightedVectorStoreRetriever` is a retriever that uses a combination of semantic similarity and a time decay. \n",
"\n",
"By doing so, it considers both the \" **freshness** \" and \" **relevance** \" of documents or data in its results.\n",
"By doing so, it considers both the \" **freshness** \" and \" **relevance** \" of the documents or data in its results.\n",
"\n",
"The algorithm for scoring them is: \n",
"\n",
@@ -29,7 +29,7 @@
"\n",
"The key feature of this approach is that it evaluates the “ **freshness of information** ” based on the last time the object was accessed. \n",
"\n",
"In other words, **objects that are accessed frequently maintain a high score** over time, increasing the likelihood that **frequently used or important information will appear near the top** of search results. This allows the retriever to provide dynamic results that account for both recency and relevance.\n",
"In other words, **objects that are accessed frequently maintain a higher score** over time, increasing the likelihood that **frequently used or important information will appear near the top** of search results. This allows the retriever to provide dynamic results that account for both recency and relevance.\n",
"\n",
"Importantly, in this context, `decay_rate` is determined by the **time since the object was last accessed** , not since it was created. \n",
"\n",
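The overview text in this hunk ties the decay to the time since an object was last accessed, not since it was created. A minimal sketch of that idea follows, assuming the scoring formula `semantic_similarity + (1.0 - decay_rate) ^ hours_passed` from the LangChain how-to guide listed in the notebook's references. The helper `recency`, the timestamps, and the `decay_rate` of 0.01 are illustrative and are not taken from the notebook.

```python
from datetime import datetime, timedelta


def recency(decay_rate: float, last_accessed_at: datetime, now: datetime) -> float:
    """Recency term (1 - decay_rate) ** hours_passed, measured from the last access."""
    hours_passed = (now - last_accessed_at).total_seconds() / 3600
    return (1.0 - decay_rate) ** hours_passed


now = datetime(2024, 1, 2, 12, 0)        # arbitrary "current" time (assumption)
created_at = now - timedelta(hours=24)   # both documents were created a day ago

# Only the last-access timestamp differs between the two documents.
last_access = {
    "untouched": created_at,                    # never read since creation
    "recently_read": now - timedelta(hours=1),  # read again one hour ago
}

scores = {name: recency(0.01, ts, now) for name, ts in last_access.items()}
print(scores)
```

Both documents have the same age, yet the one that was read again an hour ago keeps the larger recency term, which is the behavior the overview describes.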
@@ -41,13 +41,13 @@
"- [Environment Setup](#environment-setup)\n",
"- [Low decay_rate](#low-decay_rate)\n",
"- [High decay_rate](#high-decay_rate)\n",
"- [decay_rate overview](#decay_rate-overview)\n",
"- [Adjusting the decay_rate with mocked time](#adjusting-the-decay_rate-with-mocked-time)\n",
"- [Summary of the decay_rate](#summary-of-the-decay_rate)\n",
"- [Testing with Virtual time](#testing-with-virtual-time)\n",
"\n",
"### References\n",
"\n",
"- [Time-weighted vector store retriever](https://python.langchain.com/docs/how_to/time_weighted_vectorstore/)\n",
"- [TimeWeightVectorStoreRetriever](https://python.langchain.com/api_reference/langchain/retrievers/langchain.retrievers.time_weighted_retriever.TimeWeightedVectorStoreRetriever.html)\n",
"- [TimeWeightedVectorStoreRetriever](https://python.langchain.com/api_reference/langchain/retrievers/langchain.retrievers.time_weighted_retriever.TimeWeightedVectorStoreRetriever.html)\n",
"- [mock_now](https://python.langchain.com/api_reference/core/utils/langchain_core.utils.utils.mock_now.html)\n",
"----"
]
@@ -120,7 +120,7 @@
" \"LANGCHAIN_API_KEY\": \"\",\n",
" \"LANGCHAIN_TRACING_V2\": \"true\",\n",
" \"LANGCHAIN_ENDPOINT\": \"https://api.smith.langchain.com\",\n",
" \"LANGCHAIN_PROJECT\": \"TimeWeightVectorStoreRetriever\",\n",
" \"LANGCHAIN_PROJECT\": \"TimeWeightedVectorStoreRetriever\",\n",
" }\n",
")"
]
@@ -161,7 +161,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Initializing the `TimeWeightedVectorStoreRetriever` with a very small `decay_rate` and k=1 (where k is the number of vectors to retrieve)."
"Let's first initialize the `TimeWeightedVectorStoreRetriever` with a very small `decay_rate` and `k=1` (where `k` is the number of vectors to retrieve)."
]
},
{
@@ -187,7 +187,7 @@
"index = faiss.IndexFlatL2(embedding_size)\n",
"vectorstore = FAISS(embeddings_model, index, InMemoryDocstore({}), {})\n",
"\n",
"# Initialize the time-weighted vector store retriever. (in here, we'll apply with a very small decay_rate)\n",
"# Initialize the time-weighted vector store retriever. (Here, we'll apply a very small decay_rate)\n",
"retriever = TimeWeightedVectorStoreRetriever(\n",
" vectorstore=vectorstore, decay_rate=0.0000000000000000000000001, k=1\n",
")"
@@ -272,9 +272,9 @@
"source": [
"## High decay_rate\n",
"\n",
"When a high `decay_rate` is used (e.g., 0.9999...), the `recency score` rapidly converges to 0.\n",
"When a high `decay_rate` is used (e.g., 0.9999...), the **recency score** rapidly converges to 0.\n",
"\n",
"(If this value were set to 1, all objects would end up with a `recency` value of 0, resulting in the same outcome as a standard vector lookup.)\n",
"If this value were set to 1, all objects would end up with a `recency` value of 0, resulting in the same outcome as a standard vector lookup. \n",
"\n",
"Initialize the retriever using `TimeWeightedVectorStoreRetriever` , setting the `decay_rate` to 0.999 to adjust the time-based weight decay rate."
]
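The effect of the two `decay_rate` extremes discussed in this hunk can be checked numerically. The sketch below is not the retriever's code. It assumes the documented formula `semantic_similarity + (1.0 - decay_rate) ^ hours_passed`, and the function names, the similarity of 0.5, and the elapsed hours are hypothetical values chosen for illustration.

```python
def recency_score(decay_rate: float, hours_passed: float) -> float:
    """Time-decay term of the assumed formula: (1 - decay_rate) ** hours_passed."""
    return (1.0 - decay_rate) ** hours_passed


def combined_score(semantic_similarity: float, decay_rate: float, hours_passed: float) -> float:
    """Assumed total score: semantic similarity plus the recency term."""
    return semantic_similarity + recency_score(decay_rate, hours_passed)


# Compare a document accessed 1 hour ago with one accessed 48 hours ago,
# holding the (made-up) semantic similarity fixed at 0.5.
for decay_rate in (1e-25, 0.999, 1.0):
    fresh = combined_score(0.5, decay_rate, hours_passed=1)
    stale = combined_score(0.5, decay_rate, hours_passed=48)
    print(f"decay_rate={decay_rate}: fresh={fresh:.6f} stale={stale:.6f}")
```

With `decay_rate=1e-25` the recency term stays at 1.0 in floating point, so only similarity separates documents. With `decay_rate=1.0` the term is 0 for any positive elapsed time, which matches the note above about a standard vector lookup.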
@@ -371,22 +371,22 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## decay_rate overview\n",
"## Summary of the decay_rate\n",
"\n",
"- when `decay_rate` is set to a very small value, such as 0.000001:\n",
"- when the `decay_rate` is set to a very small value, such as 0.000001:\n",
" - The decay rate (i.e., the rate at which information is forgotten) is extremely low, so information is hardly forgotten.\n",
" - As a result, **there is almost no difference in time-based weights between recent and older information** . In this case, similarity scores are given higher priority.\n",
" - As a result, there is almost **no difference in time-based weights between more or less recently accessed information** . In this case, similarity scores are given higher priority.\n",
"\n",
"- When `decay_rate` is set close to 1, such as 0.999:\n",
" - The decay rate is very high, so most past information is almost completely forgotten.\n",
" - As a result, in such cases, higher scores are given to more recent information.\n"
"- When the `decay_rate` is set close to 1, such as 0.999:\n",
" - The decay rate is very high, so most of the recently unaccessed information is almost completely forgotten.\n",
" - As a result, in such cases, higher scores are given to more recently accessed information.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Adjusting the decay_rate with Mocked Time\n",
"## Testing with Virtual Time\n",
"\n",
"`LangChain` provides some utilities that allow you to test time-based components by mocking the current time.\n",
"\n",
@@ -398,7 +398,7 @@
"metadata": {},
"source": [
"[**NOTE**] \n",
"Inside the with statement, all `datetime.now()` calls return the **mocked time** . Once you **exit** the with block, it reverts back to the **original time** ."
"Inside the with statement, all `datetime.now` calls return the **mocked time** . Once you **exit** the with block, it reverts back to the **original time** ."
]
},
{
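The note in the last hunk says that `datetime.now` returns the mocked time inside the `with` block and the original time afterwards. The standard-library sketch below illustrates that behavior only. It is not LangChain's `mock_now` implementation, and the helper `fake_now`, the class `FrozenDateTime`, and the date are hypothetical.

```python
import contextlib
import datetime


@contextlib.contextmanager
def fake_now(fixed: datetime.datetime):
    """Hypothetical helper: make datetime.datetime.now() return `fixed` inside the block."""
    real_datetime = datetime.datetime

    class FrozenDateTime(real_datetime):
        @classmethod
        def now(cls, tz=None):
            return fixed

    # Swap the class on the module so lookups via `datetime.datetime` see the fake.
    datetime.datetime = FrozenDateTime
    try:
        yield
    finally:
        # Always restore the real class, even if the block raises.
        datetime.datetime = real_datetime


virtual_time = datetime.datetime(2024, 8, 30, 0, 0)  # arbitrary date (assumption)

with fake_now(virtual_time):
    inside = datetime.datetime.now()   # the mocked time

outside = datetime.datetime.now()      # the real clock again
print(inside, outside)
```

The swap only affects code that looks up `datetime.datetime` through the module at call time. A name bound earlier with `from datetime import datetime` would keep pointing at the real class in this sketch.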