  • IBM CDL (currently)
  • BJ, China

Organizations

@opendatalab


A Go daemon that syncs MongoDB to Elasticsearch in real time. You know, for search.

Go 1,331 193 Updated Aug 22, 2025

MinerU-HTML: An SLM-powered HTML main content extractor that outputs clean HTML bodies. Perfect for Deep Research Agents, RAG applications, and training data generation.

HTML 202 23 Updated Dec 25, 2025

A Python package for interacting with the MinerU Vision-Language Model.

Python 103 27 Updated Feb 5, 2026

VS Code in the browser

TypeScript 76,207 6,509 Updated Feb 13, 2026

Data browser based on S3. A visualization tool for S3-hosted data (json / jsonl / parquet / html / md, etc.). 👇 Try online.

TypeScript 79 12 Updated Nov 11, 2025

🤖 The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, dif…

Go 42,770 3,550 Updated Feb 13, 2026

Get up and running with Kimi-K2.5, GLM-5, MiniMax, DeepSeek, gpt-oss, Qwen, Gemma and other models.

Go 162,515 14,569 Updated Feb 12, 2026
Python 14 Updated May 16, 2025

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 23,494 4,433 Updated Feb 13, 2026

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python 41,247 7,216 Updated Feb 13, 2026

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM.

Python 52,147 4,314 Updated Feb 12, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 70,231 13,429 Updated Feb 13, 2026

LLM inference in C/C++

C++ 94,958 14,888 Updated Feb 13, 2026

Production-ready platform for agentic workflow development.

TypeScript 129,532 20,151 Updated Feb 13, 2026
Python 9 Updated Aug 20, 2025

UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition

Python 455 38 Updated Sep 28, 2025

[ICLR 2025 Spotlight] The official implementation of the paper “LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models”

Python 175 4 Updated Feb 7, 2026

WanJuan 1.0 multimodal corpus

569 28 Updated Oct 20, 2023

A Comprehensive Toolkit for High-Quality PDF Content Extraction

Python 9,358 700 Updated Jan 3, 2025

Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.

Python 54,325 4,512 Updated Feb 9, 2026

[ICCV 2025] The official implementation of the paper “Street-to-Satellite Image Synthesis with Diffusion Models and BEV Paradigm”

Python 82 6 Updated Oct 17, 2025

WanJuan-CC is a high-quality dataset built from CommonCrawl through data extraction, rule-based cleaning, deduplication, safety filtering, and quality cleaning.

14 Updated Apr 18, 2024

[ECCV 2024 Best Paper Candidate & TPAMI 2025] PointLLM: Empowering Large Language Models to Understand Point Clouds

Python 973 51 Updated Aug 14, 2025

LabelU front-end library

TypeScript 9 4 Updated Feb 28, 2023

Data annotation toolbox that supports image, audio and video data.

Python 1,495 158 Updated Oct 1, 2025

[AAAI 2023] Official PyTorch implementation of paper "ACE: Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency".

Python 238 12 Updated Dec 7, 2022

Data annotation component library, provided as NPM packages

TypeScript 146 48 Updated Nov 19, 2025

datasets resource

130 15 Updated Jul 1, 2025

Data Set Description Language Specification (DSDL, a next-generation description language for AI datasets)

HTML 47 6 Updated May 29, 2024