  • About me

    Short Bio. I am a tenure-track Assistant Professor in the Department of Computer Science at the University of North Carolina at Chapel Hill, with a joint appointment in the School of Data Science and Society and an adjunct appointment in the Department of Biostatistics. My lab, AIMING, studies adaptive intelligence through alignment, interaction and learning, and is affiliated with the UNC NLP group. I was a Postdoctoral Scholar at Stanford University hosted by Chelsea Finn. I received my Ph.D. in 2021 from Pennsylvania State University under the supervision of Zhenhui (Jessie) Li. During my Ph.D., I also visited CMU, hosted by Eric P. Xing.

    Lab Openings:

    - We will recruit 3 Ph.D. students for Fall 2026 and multiple interns or visiting students all year. Please read THIS for detailed recruitment information.

    Research Interests. My research focuses on both the theoretical and applied aspects of building generalizable and adaptive agentic foundation models. Additionally, I am keen on utilizing these models to facilitate applications in biomedicine/healthcare and robotics. Currently, my primary endeavors revolve around the following key directions:

    1. Enabling continual adaptability and evolvability of AI agents in unseen environments and tasks.
    2. Exploring effective strategies to fine-tune foundation agents for enhanced generalization, reasoning, and alignment.
    3. Pioneering interactive systems that seamlessly integrate humans, foundation models, and skills to achieve complex goals.

    You can follow me on Twitter at @HuaxiuYaoML or 小红书 at Huaxiu Yao.

    News and Travel

    [2025-2026 Service] Senior Area Chair for ACL 2025, EMNLP 2025; Area Chair for ICML 2025, NeurIPS 2025, ICLR 2025, AISTATS 2025; Action Editor for TMLR

    [2026.01] Six papers were accepted by ICLR 2026, two papers were accepted by MLSys 2026 and ICRA 2026, respectively.

    [2025.09] Four papers were accepted by EMNLP 2025 (two main track, two findings)

    [2025.07] Two papers were accepted by COLM 2025

    [2025.05] Three papers were accepted by ICML 2025, four papers were accepted by ACL 2025 (two main track, two findings)

    [2025.01] Six papers were accepted by ICLR 2025, two papers were accepted by findings of NAACL 2025, and one paper was accepted by ICRA

    [2024.09] Five papers were accepted by NeurIPS 2024 (three main track, two D&B track), and one paper was accepted by EMNLP 2024

    Awards

    • Cisco Faculty Research Award, 2026
    • Amazon Research Awards, 2025
    • UNC Junior Faculty Development Award, 2025
    • PharmAlliance Early Career Researcher Award, 2025
    • KDD Health Day Distinguished Vision Award, 2025
    • TMLR Outstanding Paper Award, 2024
    • KDD Best Paper Award, 2024
    • Cisco Faculty Research Award, 2024
    • National AI Research Resource Pilot Award, 2024
    • Creativity Hubs Seed-funding Winner, 2024
    • NC TraCS Pilot Award, 2024
    • AAAI New Faculty Highlights, 2024
    • AI2000 Most Influential Scholar Award Honorable Mention, 2022
    • AI Rising Stars in Chinese Students, Baidu Research, 2021
    • College of IST Ph.D. Award for Research Excellence, Penn State University, 2020
  • Selected Recent Publications

    Please see the complete list in Google Scholar.

    Underlined (co-)first authors are students mentored by me; : equal advising

    Foundation Model Algorithms & Evaluation

    [1] Peng Xia*, Jianwen Chen*, Hanyang Wang*, Jiaqi Liu, Kaide Zeng, Yu Wang, Siwei Han, Yiyang Zhou, Xujiang Zhao, Haifeng Chen, Zeyu Zheng, Cihang Xie, Huaxiu Yao, SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning, arXiv 2602.08234.

    [AI Agent]

    [2] Jiaqi Liu*, Yaofeng Su*, Peng Xia, Siwei Han, Zeyu Zheng, Cihang Xie, Mingyu Ding, Huaxiu Yao, SimpleMem: Efficient Lifelong Memory for LLM Agents, arXiv 2601.02553.

    [AI Agent]

    [3] Peng Xia, Kaide Zeng, Jiaqi Liu, Can Qin, Fang Wu, Yiyang Zhou, Caiming Xiong, Huaxiu Yao, Agent0: Unleashing self-evolving agents from zero data via tool-integrated reasoning, arXiv 2511.16043.

    [AI Agent]

    [4] Zhaoyang Wang, Canwen Xu, Boyi Liu, Yite Wang, Siwei Han, Zhewei Yao, Huaxiu Yao, Yuxiong He, Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning, arXiv 2602.10090.

    [AI Agent]

    [5] Siwei Han, Kaiwen Xiong, Jiaqi Liu, Xinyu Ye, Yaofeng Su, Wenbo Duan, Xinyuan Liu, Cihang Xie, Mohit Bansal, Mingyu Ding, Linjun Zhang, Huaxiu Yao, Alignment Tipping Process: How Self-Evolution Pushes LLM Agents Off the Rails, arXiv 2510.04860.

    [AI Agent]

    [6] Yiyang Zhou*, Haoqin Tu*, Zijun Wang, Zeyu Wang, Niklas Muennighoff, Fan Nie, Yejin Choi, James Zou, Chaorui Deng, Shen Yan, Haoqi Fan, Cihang Xie, Huaxiu Yao, Qinghao Ye, When visualizing is the first step to reasoning: Mira, a benchmark for visual chain-of-thought, arXiv:2511.02779.

    [VLM/VLA] [Benchmark & Evaluation]

    [7] Fan Nie, Ken Ziyu Liu, Zihao Wang, Rui Sun, Wei Liu, Weijia Shi, Huaxiu Yao, Linjun Zhang, Andrew Y Ng, James Zou, Sanmi Koyejo, Yejin Choi, Percy Liang, Niklas Muennighoff, Uq: Assessing language models on unsolved questions, arXiv:2508.17580.

    [Benchmark & Evaluation]

    [8] Zijian Zhang*, Kaiyuan Zheng*, Zhaorun Chen*, Joel Jang, Yi Li, Siwei Han, Chaoqi Wang, Mingyu Ding, Dieter Fox, Huaxiu Yao, Grape: Generalizing robot policy via preference alignment, in Proceedings of 2026 IEEE International Conference on Robotics and Automation (ICRA 2026), Vienna, Austria, Jun 2026.

    [VLM/VLA]

    [9] Xinyu Geng*, Peng Xia*, Zhen Zhang, Xinyu Wang, Qiuchen Wang, Ruixue Ding, Chenxi Wang, Jialong Wu, Yida Zhao, Kuan Li, Yong Jiang, Pengjun Xie, Fei Huang, Huaxiu Yao, Yi R. Feng, Jingren Zhou, Webwatcher: Breaking new frontier of vision-language deep research agent, in Proceedings of the 14th International Conference on Learning Representations (ICLR 2026), Rio de Janeiro, Brazil, Apr 2026.

    [AI Agent] [VLM/VLA]

    [10] Yiyang Zhou*, Yangfan He*, Yaofeng Su, Siwei Han, Joel Jang, Gedas Bertasius, Mohit Bansal, Huaxiu Yao, ReAgent-V: A Reward-Driven Multi-Agent Framework for Video Understanding, in Proceedings of the Thirty-Ninth Conference on Neural Information Processing Systems (NeurIPS 2025), San Diego, CA, Dec 2025.

    [AI Agent] [VLM/VLA]

    [11] Haibo Tong*, Zhaoyang Wang*, Zhaorun Chen, Haonian Ji, Shi Qiu, Siwei Han, Kexin Geng, Zhongkai Xue, Yiyang Zhou, Peng Xia, Mingyu Ding, Rafael Rafailov, Chelsea Finn, Huaxiu Yao, MJ-VIDEO: Fine-Grained Benchmarking and Rewarding Video Preferences in Video Generation, in Proceedings of the Thirty-Ninth Conference on Neural Information Processing Systems (NeurIPS 2025, Spotlight), San Diego, CA, Dec 2025.

    [VLM/VLA] [Benchmark & Evaluation]

    [12] Zhaorun Chen*, Yichao Du*, Zichen Wen*, Yiyang Zhou*, Chenhang Cui, Zhenzhen Weng, Haoqin Tu, Chaoqi Wang, Zhengwei Tong, Qinglan Huang, Canyu Chen, Qinghao Ye, Zhihong Zhu, Yuqing Zhang, Jiawei Zhou, Zhuokai Zhao, Rafael Rafailov, Chelsea Finn, Huaxiu Yao, MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?, in Proceedings of the Thirty-Ninth Conference on Neural Information Processing Systems Track on Datasets & Benchmarks (NeurIPS 2025), San Diego, CA, Dec 2025.

    [VLM/VLA] [Benchmark & Evaluation]

    [13] Yiyang Zhou*, Zhaoyang Wang*, Tianle Wang*, Shangyu Xing, Peng Xia, Bo Li, Kaiyuan Zheng, Zijian Zhang, Zhaorun Chen, Wenhao Zheng, Xuchao Zhang, Chetan Bansal, Weitong Zhang, Ying Wei, Mohit Bansal, Huaxiu Yao, AnyPrefer: An Automatic Framework for Preference Data Synthesis, in Proceedings of the 13th International Conference on Learning Representations (ICLR 2025), Singapore, Apr 2025.

    [AI Agent] [VLM/VLA] [Benchmark & Evaluation]

    [14] Yiyang Zhou*, Zhiyuan Fan*, Dongjie Cheng*, Sihan Yang, Zhaorun Chen, Chenhang Cui, Xiyao Wang, Yun Li, Linjun Zhang, Huaxiu Yao, Calibrated Self-Rewarding Vision Language Models, in Proceedings of the Thirty-Eighth Conference on Neural Information Processing Systems (NeurIPS 2024), Vancouver, Canada, Dec 2024.

    [VLM/VLA]

    [15] Yiyang Zhou*, Chenhang Cui*, Rafael Rafailov, Chelsea Finn, Huaxiu Yao, Aligning modalities in vision large language models via preference fine-tuning, arXiv:2402.11411.

    [VLM/VLA]

    [16] Yiyang Zhou*, Chenhang Cui*, Jaehong Yoon, Linjun Zhang, Zhun Deng, Chelsea Finn, Mohit Bansal, Huaxiu Yao, Analyzing and Mitigating Object Hallucination in Large Vision-Language Models, in Proceedings of the 12th International Conference on Learning Representations (ICLR 2024), Vienna, Austria, May 2024.

    [VLM/VLA]

    [17] Katherine Tian*, Eric Mitchell*, Allan Zhou, Archit Sharma, Rafael Rafailov, Huaxiu Yao, Chelsea Finn, Christopher D Manning, Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback, in Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), Singapore, Dec. 2023.


    [18] Percy Liang, Rishi Bommasani, Tony Lee, [and 47 others, including Huaxiu Yao], Holistic Evaluation of Language Models, Transactions on Machine Learning Research (TMLR, Featured), 2023. (Outstanding Paper Award)

    [Benchmark & Evaluation]

    Foundation Model Applications

    [1] Siwei Han, Peng Xia, Ruiyi Zhang, Tong Sun, Yun Li, Hongtu Zhu, Huaxiu Yao, MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding, arXiv 2503.13964.

    [FM for Document Processing]

    [2] Peng Xia*, Jinglu Wang*, Yibo Peng*, Kaide Zeng, Xian Wu, Xiangru Tang, Hongtu Zhu, Yun Li, Shujie Liu, Yan Lu, Huaxiu Yao, MMedAgent-RL: Optimizing Multi-Agent Collaboration for Multimodal Medical Reasoning, in Proceedings of the 14th International Conference on Learning Representations (ICLR 2026), Rio de Janeiro, Brazil, Apr 2026.

    [FM for Health]

    [3] Haonian Ji*, Shi Qiu*, Siyang Xin*, Siwei Han*, Zhaorun Chen, Hongyi Wang, Dake Zhang, Huaxiu Yao, From EduVisBench to EduVisAgent: A Benchmark and Multi-Agent Framework for Reasoning-Driven Pedagogical Visualization, in Proceedings of the 14th International Conference on Learning Representations (ICLR 2026), Rio de Janeiro, Brazil, Apr 2026.

    [FM for Education]

    [4] Kangyu Zhu*, Peng Xia*, Yun Li, Hongtu Zhu, Sheng Wang, Huaxiu Yao, MMedPO: Aligning Medical Vision-Language Models with Clinical-Aware Multimodal Preference Optimization, in Proceedings of the Forty-second International Conference on Machine Learning (ICML 2025), Vancouver, Canada, Jul 2025. [arXiv] [Code]

    [FM for Health]

    [5] Peng Xia, Kangyu Zhu, Haoran Li, Tianze Wang, Weijia Shi, Sheng Wang, Linjun Zhang, James Zou, Huaxiu Yao, MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models, in Proceedings of the 13th International Conference on Learning Representations (ICLR 2025), Singapore, Apr 2025. [arXiv] [Code]

    [FM for Health]

    [6] Peng Xia, Ze Chen, Juanxi Tian, Yangrui Gong, Ruibo Hou, Yue Xu, Zhenbang Wu, Zhiyuan Fan, Yiyang Zhou, Kangyu Zhu, Wenhao Zheng, Zhaoyang Wang, Xiao Wang, Xuchao Zhang, Chetan Bansal, Marc Niethammer, Junzhou Huang, Hongtu Zhu, Yun Li, Jimeng Sun, Zongyuan Ge, Gang Li, James Zou, Huaxiu Yao, CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models, in Proceedings of the Thirty-Eighth Conference on Neural Information Processing Systems Track on Datasets & Benchmarks (NeurIPS 2024), Vancouver, Canada, Dec 2024.

    [FM for Health]

    [7] Peng Xia*, Kangyu Zhu*, Haoran Li, Hongtu Zhu, Yun Li, Gang Li, Linjun Zhang, Huaxiu Yao, RULE: Reliable Multimodal RAG for Factuality in Medical Vision Language Models, in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024), Miami, Nov. 2024.

    [FM for Health]

  • Teaching

    Lecture

    • DATA 523: Modeling and Data Mining For Artificial Intelligence, Spring 2026
    • CS 790-183: Transfer Learning, UNC-CH, Fall 2025
    • DATA 140: Introduction to Data Structures and Management, UNC-CH, Spring 2025
    • CS 590/790-183: Transfer Learning, UNC-CH, Spring 2024
    • CS 790-150: Reliable Machine Learning, UNC-CH, Fall 2023, Fall 2024
    • CS 330: Deep Multi-Task and Meta Learning (Domain Generalization), Stanford University, Fall 2022

    Tutorial

    • Learning with Small Data. (KDD 2020 [Website] [Slides] [YouTube] [Bilibili]) (WSDM 2020 [Website]) (AAAI 2021)
    • Meta-learning and Automated Machine Learning: Approaches and Applications. (IJCAI 2020)

  • Service

    Conference Area Chair

    • International Conference on Machine Learning (ICML), 2024
    • Conference on Neural Information Processing Systems (NeurIPS), 2024; D&B Track (2022 - 2024)
    • International Conference on Learning Representations (ICLR), 2025
    • International Conference on Artificial Intelligence and Statistics (AISTATS), 2024
    • Empirical Methods in Natural Language Processing (EMNLP), 2024
    • International Conference on Automated Machine Learning (AutoML-Conf), 2022 - 2024
    • Learning on Graphs Conference (LoG), 2022 - 2024

    Workshop Organizer

  • Contact

    Office: 254, Sitterson Hall, Chapel Hill, NC 27599