- Harbin Institute of Technology
- Harbin
- https://huicongzhang.github.io
- @huicong_zhang
Stars
[DEIMv2] Real Time Object Detection Meets DINOv3
[CVPR 2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale
Detect Anything via Next Point Prediction (Based on Qwen2.5-VL-3B)
Effortless data labeling with AI support from Segment Anything and other awesome models.
[NeurIPS 2025] YOLOv12: Attention-Centric Real-Time Object Detectors
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
Ultra-high-performance, secure, all-in-one acceleration engine for developer resources
Repository for the code related to the paper "CARDIE: clustering algorithm on relevant descriptors for image enhancement"
SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild. Our new online demo is also released at suppixel.ai.
Official PyTorch implementation of the Motion-adaptive Transformer for Event-based Image Deblurring (AAAI 2025).
Official PyTorch implementation of the Motion Aware Event Representation-driven Image Deblurring (ECCV 2024).
Motion Deblurring via Spatial-Temporal Collaboration of Frames and Events
Official repository for the ECCV 2024 paper "CMTA: Cross-Modal Temporal Alignment for Event-guided Video Deblurring"
The official implementation of "Compositional Generative Model of Unbounded 4D Cities". (TPAMI 2026)
[ICCV 2025] STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution
A generative world for general-purpose robotics & embodied AI learning.
[IEEE TPAMI] Latency-aware Unified Dynamic Networks for Efficient Image Recognition
[NeurIPS 2024] The official PyTorch implementation for Learning Truncated Causal History Model for Video Restoration.
A LaTeX template for journal review response (initially designed for IEEE TGRS)
Official Implementation for our NeurIPS 2024 paper, "Don't Look Twice: Run-Length Tokenization for Faster Video Transformers".
Collection of diffusion model papers categorized by their subareas
A paper list of recent Mamba efforts for low-level vision.
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
[ECCV 2024] Restore Anything with Masks: Leveraging Mask Image Modeling for Blind All-in-One Image Restoration
A Survey of Embodied Learning for Object-Centric Robotic Manipulation
Lightweight Event-based Optical Flow Estimation via Iterative Deblurring