Sihan Yang
Hi there! I am a senior at the University of Electronic Science and Technology of China (UESTC). I will join The Chinese University of Hong Kong as a PhD student in Fall 2026. Previously, I spent a wonderful time at Shanghai AI Laboratory.
I'm interested in internship or collaboration opportunities on foundation model architectures, especially linear sequence modeling. If you have such an opportunity, please feel free to contact me.
Email / CV / Scholar / GitHub
|
Photo credit to my homie Taoran
|
Research
I'm interested in network architectures for foundation models, efficient deep learning, and machine learning systems. I aspire to become an algorithm and ML systems co-designer. Some papers are highlighted.
|
*Equal Contribution
‡Project Lead
†Corresponding Author
MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence
Jingli Lin*, Runsen Xu*‡, Shaohao Zhu, Sihan Yang, Peizhou Cao, Yunlong Ran, Miao Hu, Chenming Zhu, Yiman Xie, Yilin Long, Wenbo Hu, Dahua Lin, Tai Wang†, Jiangmiao Pang†
arXiv
Homepage |
Dataset |
Paper |
arXiv |
Code
|
MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence
Sihan Yang*, Runsen Xu*‡, Yiman Xie, Sizhe Yang, Mo Li, Jingli Lin, Chenming Zhu, Xiaochen Chen, Haodong Duan, Xiangyu Yue, Dahua Lin, Tai Wang†, Jiangmiao Pang†
arXiv
Homepage |
Dataset |
Paper |
arXiv
We introduce a challenging, diverse, and comprehensive multi-image spatial reasoning benchmark, manually annotated by six 3D vision experts, which additionally supports thorough evaluation of reasoning processes.
|
VFlowOpt: A Token Pruning Framework for LMMs with Visual Information Flow-Guided Optimization
Sihan Yang, Runsen Xu, Chenhang Cui, Tai Wang, Dahua Lin, Jiangmiao Pang
ICCV 2025
Paper |
arXiv |
Code
We propose a visual token pruning framework that designs optimal, model-specific pruning strategies for different MLLMs.
|
Improving Alignment in LVLMs with Debiased Self-Judgment
Sihan Yang*, Chenhang Cui*, Zihao Zhao, Yiyang Zhou, Weilong Yan, Ying Wei, Huaxiu Yao
EMNLP 2025 Findings
Paper |
arXiv |
Dataset |
Code
|
Calibrated Self-rewarding Vision Language Models
Yiyang Zhou*, Zhiyuan Fan*, Dongjie Cheng*, Sihan Yang, Zhaorun Chen, Chenhang Cui, Xiyao Wang, Yun Li, Linjun Zhang, Huaxiu Yao
NeurIPS 2024
Paper |
arXiv |
Code
|
Honors and Awards
|
SenseTime Scholarship (awarded annually to 30 undergraduates in the field of AI from across China)
Tencent Scholarship (sole recipient in the School of Software Engineering, UESTC; 1/718)
The Most Outstanding Students Award of UESTC (top 10 at UESTC)
Scholarship in Honor of Modern Scientists (top 10 at UESTC)
National Scholarship for the 2024/2025 Academic Year
National Scholarship for the 2023/2024 Academic Year
National Scholarship for the 2022/2023 Academic Year
|
Academic Service
|
Reviewer, ICLR 2026
Reviewer, CVPR 2026