  • Zhejiang University
  • Hangzhou

MiniMax M2.1, a SOTA model for real-world dev & agents.

166 10 Updated Dec 26, 2025

Pixio: a capable vision encoder dedicated to dense prediction, simply by pixel reconstruction

Python 268 8 Updated Dec 26, 2025

Towards Scalable Pre-training of Visual Tokenizers for Generation

Python 369 8 Updated Dec 16, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 66,279 12,221 Updated Dec 27, 2025

Official PyTorch Code for "OmniAID: Decoupling Semantic and Artifacts for Universal AI-Generated Image Detection in the Wild".

Python 12 Updated Dec 11, 2025

Semantics Lead the Way: Harmonizing Semantic and Texture Modeling with Asynchronous Latent Diffusion

Python 296 3 Updated Dec 21, 2025

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Python 2,145 135 Updated Dec 15, 2025

Paper Debugger is the best Overleaf companion

TypeScript 1,180 56 Updated Dec 21, 2025

A minimal PyTorch re-implementation of Qwen3 VL with a fancy CLI

Python 294 17 Updated Dec 2, 2025
Python 8,018 470 Updated Dec 25, 2025

HunyuanVideo-1.5: A leading lightweight video generation model

Python 2,208 104 Updated Dec 25, 2025
Python 166 9 Updated Nov 26, 2025

PyTorch implementation of JiT https://arxiv.org/abs/2511.13720

Python 1,874 112 Updated Dec 8, 2025

Official Implementation of "MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation"

Python 281 7 Updated Nov 19, 2025
Python 780 67 Updated Dec 9, 2025

The NVIDIA® Tools Extension SDK (NVTX) is a C-based Application Programming Interface (API) for annotating events, code ranges, and resources in your applications.

C++ 491 66 Updated Dec 26, 2025

State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!

Jupyter Notebook 2,003 133 Updated Dec 18, 2025

Triton implementation of FlashAttention2 that adds Custom Masks.

Python 157 15 Updated Aug 14, 2024

[ICCV2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models

Python 343 19 Updated May 27, 2024

Official implementation of paper "VMoBA: Mixture-of-Block Attention for Video Diffusion Models"

Python 58 3 Updated Jul 1, 2025

[NeurIPS 2025 Oral] Infinity⭐️: Unified Spacetime AutoRegressive Modeling for Visual Generation

Python 670 24 Updated Nov 27, 2025

Cosmos-Predict2.5, the latest version of the Cosmos World Foundation Models (WFMs) family, specialized for simulating and predicting the future state of the world in the form of video.

Python 548 48 Updated Dec 20, 2025

A professional cross-platform SSH/SFTP/Shell/Telnet/Tmux/Serial terminal.

C 29,168 2,251 Updated Mar 11, 2025

Cyberduck is a libre FTP, SFTP, WebDAV, Amazon S3, Backblaze B2, Microsoft Azure & OneDrive and OpenStack Swift file transfer client for Mac and Windows.

Java 4,146 326 Updated Dec 26, 2025

MiniMax-M2, a model built for Max coding & agentic workflows.

2,150 164 Updated Nov 13, 2025

On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)

Python 776 55 Updated Jul 5, 2025

Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.

Python 6,553 369 Updated Dec 24, 2025

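NipaPlay-Reload is a modern cross-platform local video player for Windows, macOS, Linux, Android, and iOS. It integrates danmaku (bullet-comment) display, multi-format subtitle support, multi-audio-track switching, new-season anime browsing, and other features, and supports mounting Emby/Jellyfin media libraries. Built with Flutter, it provides a unified user experience.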

Dart 1,083 48 Updated Dec 20, 2025
124 3 Updated Dec 8, 2025

Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"

Python 1,656 55 Updated Dec 26, 2025