[NeurIPS 2025] The official repository of "Inst-IT: Boosting Multimodal Instance Understanding via Explicit Visual Prompt Instruction Tuning"

Python 39 Updated Feb 20, 2025

Official PyTorch implementation for "Large Language Diffusion Models"

Python 3,415 230 Updated Nov 12, 2025

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 24,199 2,686 Updated Aug 12, 2024

[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts

Python 335 21 Updated Jul 17, 2024

A paper list of Awesome Latent Space.

248 10 Updated Dec 22, 2025

The official repository for the paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning"

Jupyter Notebook 125 3 Updated Dec 22, 2025

Official repo of "Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens"

Python 225 11 Updated Dec 9, 2025

Video-CoM: Interactive Video Reasoning via Chain of Manipulations

16 Updated Dec 1, 2025

Glance: Accelerating Diffusion Models with 1 Sample

Python 135 1 Updated Dec 15, 2025

This is a collection of recent papers on reasoning in video generation models.

86 1 Updated Dec 15, 2025
Jupyter Notebook 3 Updated Oct 24, 2025

Strong and Open Vision Language Assistant for Mobile Devices

Python 1,315 85 Updated Apr 15, 2024

Official implementation of Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning

Python 27 3 Updated Oct 30, 2024

Official Repository for "Glyph: Scaling Context Windows via Visual-Text Compression"

Python 525 50 Updated Nov 4, 2025

Contexts Optical Compression

Python 21,527 1,926 Updated Oct 25, 2025

[NeurIPS 2025] Code for the paper "Towards Better & Faster Autoregressive Image Generation: From the Perspective of Entropy"

Python 12 Updated Dec 6, 2025

Code for the Molmo Vision-Language Model

Python 839 80 Updated Dec 12, 2024

SIGMORPHON 2022 Shared Task on Morpheme Segmentation

Jupyter Notebook 30 13 Updated Mar 26, 2023

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 64,312 7,793 Updated Dec 21, 2025

[Lumina Embodied AI] Embodied AI Technical Guide (Embodied-AI-Guide)

10,101 681 Updated Dec 3, 2025

[EMNLP-Findings'24] Tokenization Falling Short: On Subword Robustness in Large Language Models

Python 9 Updated Mar 7, 2025

MuLan: Adapting Multilingual Diffusion Models for 110+ Languages (adds multilingual support to any diffusion model without additional training)

Python 144 3 Updated Jan 24, 2025

State-of-the-art LLM-based translation models.

Ruby 570 45 Updated Apr 9, 2025
Jupyter Notebook 27 Updated Mar 9, 2025

OpenICL is an open-source framework to facilitate research, development, and prototyping of in-context learning.

Python 583 30 Updated Oct 3, 2023
Python 35 1 Updated Jun 15, 2023

Facebook Low Resource (FLoRes) MT Benchmark

Python 757 133 Updated Nov 20, 2023