Borui Zhang (张博睿)

About me: I am a final-year Ph.D. candidate in the i-VisionGroup, Department of Automation, Tsinghua University, advised by Prof. Jiwen Lu. I received my B.E. degree from the Department of Automation and a second B.A. degree from the School of Economics and Management at Tsinghua University in 2021. My research aims to bridge the gap between theoretical interpretability and practical efficiency in deep learning.

Research: My primary research interests lie at the intersection of Computer Vision and Deep Learning Theory. Currently, I am focused on the following pillars:

Explainable AI (XAI):
- Black-box XAI: Axiomatic interpretation & Neural visualization.
- White-box XAI: Concept alignment & White-box architecture design.
Neural Network Theory:
- Optimization: Efficient optimization strategies & Convergence analysis.
- Inductive Bias: Frequency bias exploration & Piece-wise linear modeling.
Large Multimodal Models:
- Visual tokenizers & Native multimodal architectures.
- Efficient VLM training/inference & GUI Agents.
- Safety alignment & Model interpretability.

Email / Google Scholar / GitHub / Xiaohongshu / CV

News

2024-07: 1 paper on autonomous driving accepted to ECCV 2024.

2024-04: 1 paper on salient object detection accepted to TGRS 2024.

2024-02: 2 papers on autonomous driving accepted to CVPR 2024.

2024-01: 1 paper on explainable deep networks accepted to ICLR 2024.

2023-01: 1 paper on explainable deep networks accepted to ICLR 2023.

2022-07: 1 paper on dynamic metric learning accepted to ECCV 2022.

2022-03: 1 paper on explainable metric learning accepted to CVPR 2022.

2021-07: 1 paper on deep metric learning accepted to ICCV 2021.

Preprints

* indicates equal contribution

Quantize-then-Rectify: Efficient VQ-VAE Training

Borui Zhang*, Qihang Rao*, Wenzhao Zheng, Jie Zhou, Jiwen Lu

arXiv, 2025

[Paper] / [Code] / [Project Page] / [HF Demo] / [中文解读]

To investigate the relationship between continuous and discrete tokenizers, we propose ReVQ. This method yields a high-performance VQ-VAE requiring only 40 GPU hours of training on a single RTX 4090.

SFTok: Bridging the Performance Gap in Discrete Tokenizers

Qihang Rao*, Borui Zhang*, Wenzhao Zheng, Jie Zhou, Jiwen Lu

arXiv, 2025

[Paper] / [Code] / [Project Page] / [HF Demo] / [中文解读]

We investigate iterative approaches for constructing discrete tokenizers and propose SFTok. SFTok is analogous to the discrete diffusion paradigm, which makes it well suited to integration into Multimodal Large Language Models (MLLMs) and facilitates a unified discrete diffusion framework.

Preventing Local Pitfalls in Vector Quantization via Optimal Transport

Borui Zhang*, Wenzhao Zheng, Jie Zhou, Jiwen Lu

arXiv, 2024

[Paper] / [Code] / [Project Page] / [HF Demo] / [中文解读]

This study addresses the training instability of Vector-Quantized Networks (VQNs) by introducing OptVQ, a new method using the Sinkhorn algorithm for optimal transport. It achieves full codebook utilization (100%) and outperforms current VQNs.

Exploring Unified Perspective For Fast Shapley Value Estimation

Borui Zhang*, Baotong Tian*, Wenzhao Zheng, Jie Zhou, Jiwen Lu

arXiv, 2023

[Paper] / [Code]

This paper analyzes existing Shapley value estimators and proposes SimSHAP. Experiments validate that SimSHAP significantly accelerates the computation of accurate Shapley values.

Selected Publications

Path Choice Matters for Clear Attribution in Path Methods

Borui Zhang, Wenzhao Zheng, Jie Zhou, Jiwen Lu

ICLR 2024

[Paper] / [Code]

We introduce the Concentration Principle and develop SAMP, an efficient model-agnostic interpreter that incorporates an infinitesimal constraint (IC) and a momentum strategy (MS).

Bort: Towards Explainable Neural Networks with Bounded Orthogonal Constraint

Borui Zhang, Wenzhao Zheng, Jie Zhou, Jiwen Lu

ICLR 2023

[Paper] / [Code]

This paper proposes Bort, an optimizer for improving model explainability with boundedness and orthogonality constraints, derived from model comprehensibility conditions.

Attributable Visual Similarity Learning

Borui Zhang, Wenzhao Zheng, Jie Zhou, Jiwen Lu

CVPR 2022

[Paper] / [Code]

We propose AVSL, which employs a generalized similarity learning paradigm to represent the similarity between images with a graph for a more accurate and explainable measure.

Deep Relational Metric Learning

Wenzhao Zheng*, Borui Zhang*, Jiwen Lu, Jie Zhou

ICCV 2021

[Paper] / [Code]

This paper proposes to adaptively learn an ensemble of features that characterizes an image from different aspects, employing a relational module to capture correlations among features.

Honors and Awards

2024 12·9 Student Counselor Award

2023 National Scholarship (Highest national honor for students)

2022 Tsinghua Outstanding Student Cadre Award

2022 Tsinghua Excellent Teaching Assistant Award

2021 Tsinghua Future Scholars Scholarship (Rank: 1/181)

2020 Changtong Scholarship (Highest dept. honor)

2019 Jiang Nanxiang Scholarship (Highest junior honor)

2018 National Scholarship
