Stars
🦜🔗 Build context-aware reasoning applications
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
The official Python library for the OpenAI API
A Comprehensive Benchmark for Robust Multi-image Understanding
Benchmarking Multi-Image Understanding in Vision and Language Models
[arXiv 2025] MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
ArtiMuse: Fine-Grained Image Aesthetics Assessment with Joint Scoring and Expert-Level Understanding (书生 · 妙析, a multimodal large model for aesthetic understanding)
A high-throughput and memory-efficient inference and serving engine for LLMs
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…
In-depth tutorials on LLMs, RAGs and real-world AI agent applications.
Chinese translation of the LLMs-from-scratch project
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
Search Airbnb using your AI Agent
An AI agent development platform with all-in-one visual tools, simplifying agent creation, debugging, and deployment like never before. Coze your way to AI Agent creation.
Label Studio is a multi-type data labeling and annotation tool with standardized output format
Open Source DeepWiki: AI-Powered Wiki Generator for GitHub/Gitlab/Bitbucket Repositories. Join the discord: https://discord.gg/gMwThUMeme
DEYOLO: Dual-Feature-Enhancement YOLO for Cross-Modality Object Detection
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
Python tool for converting files and office documents to Markdown.
Create Live Photos from a photo+video pair compatible with Apple Photos
Visualize YOLOv11 classification features with t-SNE. Extract logits or softmax outputs and create customizable scatter plots.
[CVPR 2025] EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!
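AIGCPanel is a simple, easy-to-use, all-in-one AI digital human system that supports video synthesis, voice synthesis, and voice cloning, and simplifies local model management with one-click import and use of AI models.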