studentfromChina

Ronniejiang studentfromChina

2 followers · 10 following

Stars

SkyworkAI / SkyReels-V2

SkyReels-V2: Infinite-length Film Generative model

Python 4,883 690 Updated Aug 11, 2025

modelscope / DiffSynth-Studio

Enjoy the magic of Diffusion models!

Python 10,557 986 Updated Nov 4, 2025

Eyeline-Labs / Go-with-the-Flow

The official implementation of CVPR'25 Oral paper "Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise"

Python 1,036 48 Updated Oct 13, 2025

overleaf / overleaf

A web-based collaborative LaTeX editor

JavaScript 16,787 1,764 Updated Nov 4, 2025

MiliLab / GeoLLaVA-8K

Official repo for [NeurIPS 2025 Spotlight] "GeoLLaVA-8K: Scaling Remote-Sensing Multimodal Large Language Models to 8K Resolution"

Python 31 1 Updated Oct 27, 2025

MTLab / PE-Field

Python 208 5 Updated Oct 21, 2025

VisionXLab / LRS-VQA

[ICCV'25] When Large Vision-Language Model Meets Large Remote Sensing Imagery: Coarse-to-Fine Text-Guided Token Pruning

Python 38 1 Updated Aug 1, 2025

dvlab-research / DreamOmni2

This project is the official implementation of 'DreamOmni2: Multimodal Instruction-based Editing and Generation'

Python 2,370 202 Updated Oct 20, 2025

kang-wu / SkySensePlusPlus

[Nature Machine Intelligence 2025] This repository is the official implementation of the paper "A semantic-enhanced multi-modal remote sensing foundation model for Earth observation".

Python 127 3 Updated Sep 18, 2025

snap-research / ac3d

AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers

Python 137 11 Updated Sep 16, 2025

HaoZhang1018 / OmniFuse

Code of Paper OmniFuse: Composite Degradation-Robust Image Fusion with Language-Driven Semantics.

Jupyter Notebook 23 1 Updated Sep 16, 2025

Leiii-Cao / Text-DiFuse

This is the official code of the NeurIPS 2024 paper "Text-DiFuse: An Interactive Multi-Modal Image Fusion Framework based on Text-modulated Diffusion Model"

Jupyter Notebook 30 1 Updated Jun 18, 2025

XunpengYi / Text-IF

Official Code of Text-IF: Leveraging Semantic Text Guidance for Degradation-Aware and Interactive Image Fusion (CVPR2024)

Python 108 7 Updated Apr 21, 2025

HKUDS / VideoRAG

"VideoRAG: Chat with Your Videos"

Python 1,255 181 Updated Oct 22, 2025

showlab / Awesome-Video-Diffusion

A curated list of recent diffusion models for video generation, editing, and various other applications.

5,164 318 Updated Oct 15, 2025

Open-Book-Studio / THU-Coursework-Machine-Learning-for-Big-Data

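This is my open-source work from studying Tsinghua University's course 70240403-200, Machine Learning for Big Data. It includes implementations of past assignments, notes on and interpretations of the lectures, and implementations of upcoming projects. Fellow students are welcome to study and discuss together to build a better understanding of the material. The documentation can be read online: https://thu-coursework-machine-learning-for-big-data-docs.vercel.app/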

Jupyter Notebook 8 2 Updated Dec 17, 2024

earth-insights / SegEarth-R1

SegEarth-R1: Geospatial Pixel Reasoning via Large Language Model

Python 123 7 Updated Aug 27, 2025

dvlab-research / Seg-Zero

Project Page For "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement"

Python 543 25 Updated Jul 30, 2025

sii-research / VCCL

Venus Collective Communication Library, supported by SII and Infrawaves.

C++ 108 4 Updated Nov 3, 2025

SijuMa2003 / RIS-FUSION

Python 9 Updated Sep 19, 2025

thunderbolt215 / ArtiMuse

ArtiMuse: Fine-Grained Image Aesthetics Assessment with Joint Scoring and Expert-Level Understanding (书生·妙析, a large multimodal model for aesthetic understanding)

Python 77 3 Updated Oct 16, 2025

jungao1106 / ICoT

[CVPR'25] Interleaved-Modal Chain-of-Thought

Python 90 4 Updated Oct 23, 2025

SYuan03 / MM-IFEngine

[ICCV 2025] MM-IFEngine: Towards Multimodal Instruction Following

Python 109 Updated Sep 16, 2025

dvlab-research / VisionReasoner

Vision Manus: Your versatile Visual AI assistant

Python 290 15 Updated Oct 12, 2025

ByteDance-Seed / Bagel

Open-source unified multimodal model

Python 5,243 455 Updated Oct 27, 2025

Infrawaves / DeepEP_ibrc_dual-ports_multiQP

Aims to implement dual-port and multi-qp solutions in deepEP ibrc transport

Cuda 66 3 Updated May 9, 2025

UnrealZoo / unrealzoo-gym

Forked from zfw1226/gym-unrealcv

[ICCV 2025 Highlights] Large-scale photo-realistic virtual worlds for embodied AI

Python 205 11 Updated Oct 14, 2025

Norman-Ou / GeoPix

[GRSM] Project Page for "GeoPix: Multi-Modal Large Language Model for Pixel-level Image Understanding in Remote Sensing"

Python 55 5 Updated May 10, 2025

PKU-YuanGroup / MoE-LLaVA

【TMM 2025🔥】 Mixture-of-Experts for Large Vision-Language Models

Python 2,266 140 Updated Jul 15, 2025

nonwhy / PURE

[ICCV2025] PyTorch implementation of "Perceive, Understand and Restore: Real-World Image Super-Resolution with Autoregressive Multimodal Generative Models"

Python 107 4 Updated Jul 25, 2025