Starred repositories
Examples and tutorials to help developers build AI systems
Python library for Agentic Document Extraction from LandingAI
An introduction to PyTest with lots of simple, hackable examples
Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data…
Production-ready platform for agentic workflow development.
borb is a library for reading, creating and manipulating PDF files in Python.
Python bindings to PDFium, reasonably cross-platform.
Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.
Simple package to extract text with coordinates from programmatic PDFs
Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
Get your documents ready for gen AI
An Improved LangChain RAG Tutorial (v2) with local LLMs, database updates, and testing.
Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search
Convert PDF to markdown + JSON quickly with high accuracy
[CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation
Convolutional Recurrent Neural Network (CRNN) for image-based sequence recognition using PyTorch
End-to-end solution for migrating CSV data into a Neo4j graph using an LLM for the data discovery and graph data modeling stages.
GraphRAG: Knowledge in Graphs not Documents
🌄 Open Source AI & Data Landscape - provides overview of top tier projects in the open source AI and Data ecosystem, shows projects through GitHub data, funding or market cap, first and last commit…
A modular graph-based Retrieval-Augmented Generation (RAG) system
OnnxTR: a docTR (Document Text Recognition) library ONNX pipeline wrapper for seamless, high-performing & accessible OCR