document-retrieval

Here are 96 public repositories matching this topic...

chroma-core / chroma

Open-source search and retrieval database for AI applications.

rust database ai embeddings rust-lang document-retrieval rag vector-database llm llms

Updated Oct 22, 2025
Rust

OpenBMB / VisRAG

Parsing-free RAG supported by VLMs

retrieval multi-modal document-retrieval rag multi-modality document-understanding vision-language-model retrieval-augmented-generation

Updated Oct 22, 2025
Python

francescobrigante / Enterprise-RAG

Specialized Retrieval Augmented Generation pipeline designed for Corporate Documents such as CBA/CCNL, Company Regulations, and Business Policies

retrieval chunking pdf-parsing document-retrieval rag ccnl company-rag company-documents document-routing corporate-rag

Updated Oct 21, 2025
Jupyter Notebook

ruoheng-du / healthcare-doc-analysis-rag-llm

[KPMG x Columbia] Intelligent Document Analysis for Healthcare Programs Using LLMs and RAG | Fall 2025

document-retrieval rag llm

Updated Oct 21, 2025
Python

PRITHIVSAKTHIUR / Multimodal-OCR2

A comprehensive multimodal OCR application that supports both image and video document processing using state-of-the-art vision-language models. This application provides an intuitive Gradio interface for extracting text, converting documents to markdown, and performing advanced document analysis.

pillow image-analysis gradio video-understanding document-retrieval ocr-recognition huggingface-transformers vision-transformer qwen2-5-vl smoldocling

Updated Oct 16, 2025
Python

spiliossp / Information-Retrieval

An Information Retrieval engine for scientific papers – Lucene-powered with synonyms, wildcards, and smart query expansion.

python java search-engine natural-language-processing information-retrieval text-mining data-cleaning nlp-machine-learning query-expansion document-retrieval kaggle-dataset scientific-papers wildcard-searches apache-lucene research-papers search-history

Updated Oct 15, 2025
Java

vearch / vearch

Distributed vector search for AI-native applications

embeddings cloud-native vectors document-retrieval rag vector-search vector-database hybrid-search ai-native retrieval-augmented-generation ai-native-database

Updated Oct 15, 2025
Go

Md-Emon-Hasan / AutoDocThinker

Agentic AI system that lets users upload documents (PDFs, DOCX, etc.) and ask natural language questions. It uses LLM-based RAG to extract relevant information. The architecture includes multi-agent components such as document retrievers, summarizers, web searchers, and tool routers, enabling dynamic reasoning and accurate responses.

Updated Oct 3, 2025
Jupyter Notebook

LouisDo2108 / PromptDSI

[ECML PKDD 2025] Official implementation of "PromptDSI: Prompt-based Rehearsal-free Continual Learning for Document Retrieval"

information-retrieval document-retrieval continual-learning bertopic prompt-tuning rehearsal-free

Updated Sep 11, 2025
Python

mm-repos / langgraph-claude-azure-mcp

An intelligent Model Context Protocol (MCP) server for Azure AI Search integration with Claude Desktop - Transform enterprise document search into natural AI conversations using LangGraph workflows, Google Gemini, and advanced retrieval-augmented generation (RAG).

document-retrieval rag langgraph-python langsmith-tracing mcp-server claude-desktop-integration

Updated Sep 5, 2025
Python

xndien2004 / ViDrill

[VLSP 2025] ViDRILL is a Vietnamese document retrieval system for VLSP 2025. It combines dense and sparse retrieval, reranking, and optional LLM-based query rewriting and reasoning to support high-accuracy information retrieval and future LLM-enhanced pipelines.

information-retrieval reinforcement-learning query-rewriting document-retrieval vietnamese-nlp reranking vlsp-2025

Updated Aug 16, 2025
Python

AadityaRajGupta / AetherCare_Platform

AetherCare is an AI-powered healthcare platform that leverages Generative AI to assist users with medical inquiries, symptom-based disease prediction, hospital location services, and a knowledge repository for healthcare education. This project aims to enhance accessibility to healthcare information.

flask machine-learning transformers healthcare blog-article speech-to-text document-retrieval pinecone svc-model ai-chatbot hospital-finder generative-ai langchain-python

Updated Jul 29, 2025
HTML

RyanFabrick / UCSB-RAG-Chatbot

AI-powered RAG (Retrieval Augmented Generation) chatbot for the UCSB College of Engineering with semantic search, source verification & comprehensive academic information retrieval. Built with Streamlit, Python, Google Gemini, ChromaDB, LangChain & custom web scraping pipeline using Puppeteer (Node.js/JavaScript).

Updated Jul 23, 2025
Python

PRITHIVSAKTHIUR / DREX

The drex-062225-exp (Document Retrieval and Extraction Expert) model is a specialized fine-tuned version of docscopeocr-7b-050425-exp, optimized for document retrieval, content extraction, and analysis recognition. It is built on top of the Qwen2.5-VL architecture.

table documentation-tool gradio multimodality document-retrieval image-content-variation vlms image-text-to-text qwen2-5-vl

Updated Jul 18, 2025
Python

Rs306 / Team3_Final_Project

A Streamlit-powered chatbot for querying PDF documents using RAG architecture with citation-based answers.

document-retrieval rag streamlit langchain

Updated Jul 16, 2025
Python

PRITHIVSAKTHIUR / Doc-VLMs-v2-Localization

Doc-VLMs-v2-Localization is a demo app for the Camel-Doc-OCR-062825 model, fine-tuned from Qwen2.5-VL-7B-Instruct for advanced document retrieval, extraction, and analysis. It enhances document understanding and also integrates other notable Hugging Face models.

ocr table gradio document-retrieval ocr-recognition vision-language 7b huggingface-transformers vision-transformer qwen2-5-vl

Updated Jul 13, 2025
Python

PRITHIVSAKTHIUR / Doc-VLMs-exp

An experimental document-focused Vision-Language Model application that provides advanced document analysis, text extraction, and multimodal understanding capabilities. This application features a streamlined Gradio interface for processing both images and videos using state-of-the-art vision-language models specialized in document understanding.

ocr transformers spaces demo-app gradio document-retrieval vgpu huggingface-transformers vlms drex qwen2-5-vl

Updated Jul 13, 2025
Python

rakeshmalik91 / musical-document-retrieval-icdar2013

opencv ocr som ieee k-means document-retrieval icdar zernike

Updated Jul 13, 2025
C

Polo-Marco / AICUP-2023-MIG

AICUP 2023 fact verification system using PERT-large and RoBERTa for document retrieval, evidence extraction, and claim validation. Ranked 4th on the private leaderboard (score 0.71007); built with Hugging Face Transformers and TensorFlow.

chinese document-retrieval sentiment-classification

Updated Jun 18, 2025
Python

hithesh-mr / harry-potter-qna-with-cognee

Exploring How Cognee's Knowledge Graphs Can Answer Questions About the Harry Potter Universe

text-mining document-retrieval rag agentic-ai agentic-ai-development cognee

Updated Jun 8, 2025
Jupyter Notebook
