- Tongji Univ
- Shanghai, China
- UTC +08:00
Highlights
- Pro
Stars
Long-RL: Scaling RL to Long Sequences (NeurIPS 2025)
A Chinese financial trading framework based on multi-agent LLMs - TradingAgents Chinese-enhanced edition
[NeurIPS 2025] JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent
[CVPR 2025] The code for the paper "Video-3D LLM: Learning Position-Aware Video Representation for 3D Scene Understanding".
[CVPR 2025] VoCo-LLaMA: This repo is the official implementation of "VoCo-LLaMA: Towards Vision Compression with Large Language Models".
Writing AI Conference Papers: A Handbook for Beginners
[NeurIPS 2024] A Generalizable World Model for Autonomous Driving
SEED-Story: Multimodal Long Story Generation with Large Language Model
[NeurIPS 2024] Boosting the performance of consistency models with PCM!
Official PyTorch implementation of 3D Gaussian Mapping (3DGM)
OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving
[CVPR 2024 Award Candidate] Producing and Leveraging Online Map Uncertainty in Trajectory Prediction
[CVPR 2024 Oral, Best Paper Award Candidate] Official repository of "PaSCo: Urban 3D Panoptic Scene Completion with Uncertainty Awareness"
[CVPR 2024 Best Paper Award Candidate] EGTR: Extracting Graph from Transformer for Scene Graph Generation
Official implementation of "Towards Efficient Visual Adaption via Structural Re-parameterization".
Code for 3D-LLM: Injecting the 3D World into Large Language Models
An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) fo…
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
[CVPR 2024] Official implementation of "Towards Realistic Scene Generation with LiDAR Diffusion Models"
A state-of-the-art-level open visual language model | multimodal pretrained model
A curated list of awesome LLM/VLM/VLA for Autonomous Driving (LLM4AD) resources (continually updated)
llama3 implementation one matrix multiplication at a time
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
[CVPR 2024] Official Implementation of Learning to Remove Wrinkled Transparent Film with Polarized Prior