  • Tianjin University
  • Tianjin, China
Organizations

@TJU-NSL

zhixin612/README.md

I am a Ph.D. student in TANKLAB at Tianjin University, advised by Wenyu Qu and Yitao Hu. My research interests include Machine Learning Systems, LLM inference serving, and Distributed Systems. I received my B.S. degree in computer science from Northwest A&F University.

GitHub: zhixin612

Email: [email protected]


📑 Publications

  1. SLOpt: Serving Real-Time Inference Pipeline with Strict Latency Constraint
    Zhixin Zhao, Yitao Hu*, Guotao Yang, Ziqi Gong, Chen Shen, Laiping Zhao, Wenxin Li, Xiulong Liu, and Wenyu Qu.
    IEEE Transactions on Computers (TC), 2025.

  2. Harpagon: Minimizing DNN Serving Cost via Efficient Dispatching, Scheduling and Splitting
    Zhixin Zhao, Yitao Hu*, Ziqi Gong, Guotao Yang, Wenxin Li, Xiulong Liu, Keqiu Li, and Hao Wang.
    IEEE International Conference on Computer Communications (INFOCOM), 2025.

  3. TightLLM: Maximizing Throughput for LLM Inference via Adaptive Offloading Policy
    Yitao Hu, Xiulong Liu*, Guotao Yang, Linxuan Li, Kai Zeng, Zhixin Zhao, Sheng Chen, Laiping Zhao, Wenxin Li, and Keqiu Li.
    IEEE Transactions on Computers (TC), 2025.

  4. SuperSpec: Enhanced Verification and Sampling for End-to-End LLM Speculative Decoding
    Chen Shen, Rui Guo, Yang Cheng, Yang Lin, Zhixin Zhao, Yitao Hu*, Sheng Chen, Xiulong Liu, and Keqiu Li.
    IEEE International Conference on High Performance Computing and Communications (HPCC), 2025.

  5. SmartCache: Two-Dimensional KV-Cache Similarity for Efficient Long-Context LLM Decoding
    Chen Shen, Hao Chen, Kaining Hui, Zhixin Zhao, Yang Cheng, Yitao Hu*, Sheng Chen, Xiulong Liu, and Keqiu Li.
    IEEE International Conference on High Performance Computing and Communications (HPCC), 2025.

  6. High-throughput Sampling, Communicating and Training for Reinforcement Learning Systems
    Laiping Zhao, Xinan Dai, Zhixin Zhao, Yusong Xin, Yitao Hu*, Jun Qian, Jun Yao, and Keqiu Li.
    IEEE/ACM International Symposium on Quality of Service (IWQoS), 2023.


👨‍🎓 Academic Services

  • 2023: ICA3PP Program Committee member

⭐ Main Awards

  • 2021: The 2021 ICPC Shaanxi National Invitational: Silver Medal
  • 2020: The 2020 ICPC Asia-East Continent Final: Bronze Medal
  • 2020: The 45th ICPC Asia Regional Contest Shanghai Site: Silver Medal

🎓 Honors

  • 2024: Academic Scholarships, Tianjin University
  • 2023: Distinguished Academic Scholarship, Tianjin University
  • 2022: Outstanding Graduate, Northwest A&F University
  • 2021: Presidential Scholarship, Northwest A&F University
  • 2020: National Encouragement Scholarship, Northwest A&F University
  • 2019: National Encouragement Scholarship, Northwest A&F University

🏃‍♂️ Hobbies

photography📸 ping-pong🏓 badminton🏸 ...

Popular repositories

  1. papernotes-scheduling (Public)

    Summaries and notes on GPU Scheduling research papers

    3

  2. LLaMA-Factory (Public)

    Forked from hiyouga/LLaMA-Factory

    Unify Efficient Fine-tuning of 100+ LLMs

    Python 1

  3. intel-cmt-cat (Public)

    Forked from intel/intel-cmt-cat

    User space software for Intel(R) Resource Director Technology

    C

  4. ray (Public)

    Forked from ray-project/ray

    An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyp…

    Python

  5. seed_rl (Public)

    Forked from google-research/seed_rl

    SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference. Implements IMPALA and R2D2 algorithms in TF2 with SEED's architecture.

    Python

  6. xingtian-project (Public)

    Forked from huawei-noah/xingtian

    xingtian is a componentized library for the development and verification of reinforcement learning algorithms

    Python