Starred repositories
An official implementation of DanceGRPO: Unleashing GRPO on Visual Generation
Production-ready platform for agentic workflow development.
Dataset of PNG images from synthetically generated table layouts with annotations in JSONL files
A large scale camera-taken table detection and recognition dataset.
A supplement to the modern subset of ICDAR2019 cTDaR in terms of adjacency relations
[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥
Table structure recognition dataset of the paper: Complicated Table Structure Recognition
A Curated List of Awesome Table Structure Recognition (TSR) Research. Including models, papers, datasets and codes. Continuously updating.
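This project uses layoutlmv3 for training and inference on Chinese images. It mainly addresses three problems: 1. standardizing data into a usable training dataset format 2. modifying the layoutlmv3-base-chinese tokenization 3. splitting and sliding-window handling of text exceeding a length of 512
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance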
A Unified Toolkit for Deep Learning Based Document Image Analysis
OCR, layout analysis, reading order, table recognition in 90+ languages
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
Ultra-lightweight Chinese OCR with support for vertical text recognition and for ncnn, mnn, and tnn inference (dbnet (1.8M) + crnn (2.5M) + anglenet (378KB)); total model size is only 4.7M
Tesseract Open Source OCR Engine (main repository)
A lightweight LMM-based Document Parsing Model
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
This repo is meant to serve as a guide for Machine Learning/AI technical interviews.
A programmer's guide to cooking at home (Simplified Chinese only).
Official inference repo for FLUX.1 models
[EMNLP 2025 Demo] PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with layout fully preserved; supports services such as Google/DeepL/Ollama/OpenAI and provides CLI/GUI/MCP/Docker/Zotero
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step