Stars
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
A lightweight LMM-based Document Parsing Model
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
Your self-hosted, globally interconnected microblogging community
Apache Pulsar - distributed pub-sub messaging system
Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-time multi-person keypoint detection.
3D playground built on three.js and cannon.js.
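The most comprehensive database of Chinese poetry 🧶 Covers nearly 14,000 poets of the Tang and Song dynasties, with close to 55,000 Tang poems and 260,000 Song poems, plus 21,050 ci (lyric poems) by 1,564 ci poets of the Song period.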
Lynxmotion Phoenix Program for Arduino split into libraries
Benchmarks of approximate nearest neighbor libraries in Python
A library for efficient similarity search and clustering of dense vectors.
A Dead Simple BERT API for Python and Java (https://github.com/google-research/bert)
A PyTorch re-implementation of the official EfficientDet with SOTA real-time performance and pretrained weights.
BigDL: Distributed TensorFlow, Keras and PyTorch on Apache Spark/Flink & Ray
KubeOperator is an open-source, lightweight Kubernetes distribution focused on helping enterprises plan, deploy, and operate production-grade K8s clusters.
TuShare is a utility for crawling historical data of Chinese stocks
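TuShare is a tool that handles financial data such as stocks and futures, from data collection through cleaning and processing to data storage. It meets the data-acquisition needs of quantitative financial analysts and people learning data analysis, and it features broad data coverage, simple API calls, and fast responses.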
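Information Protection & OSINT resources | An all-in-one approach to collecting, protecting, and cleaning up digital privacy data, plus countermeasures against open-source intelligence (OSINT) gathering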
A command-line installer for Windows.
An Open Source Machine Learning Framework for Everyone
Repo for counting stars and contributing. Press F to pay respect to glorious developers.