Ziyu Yao NLP Lab

ICML 2025 Tutorial on Mechanistic Interpretability for Language Models. Website: https://ziyu-yao-nlp-lab.github.io/ICML25-MI-Tutorial.github.io/
MathVC NSF Project on building an LLM Agent-powered platform for Mathematics Education. Website: https://ziyu-yao-nlp-lab.github.io/MathVC-NSF.github.io/

Revisiting Prompt Optimization with Large Reasoning Models: A Case Study on Event Extraction, Preprint, 2025. Paper
Evaluating Vision-Language Models as Evaluators in Path Planning, IEEE/CVF CVPR, 2025. Paper Code Dataset
A Survey on Large Language Models for Automated Planning, Preprint, 2025. Paper
Instruction-Tuning LLMs for Event Extraction with Annotation Guidelines, ACL Findings, 2025. Paper Code
Efficient but Vulnerable: Benchmarking and Defending LLM Batch Prompting Attack, ACL Findings, 2025. Paper Code
DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search, ICLR, 2025. Paper Code
Can Large Language Models be Good Path Planners? A Benchmark and Investigation on Spatial-temporal Reasoning, ICLR Workshop on LLM Agents, 2024. Paper Code
Look Further Ahead: Testing the Limits of GPT-4 in Path Planning, IEEE 20th International Conference on Automation Science and Engineering, 2024. Paper Code
Instances Need More Care: Rewriting Prompts for Instances with LLMs in the Loop Yields Better Zero-Shot Performance, ACL Findings, 2024. Paper
Large Language Model Cascades with Mixture of Thoughts Representations for Cost-efficient Reasoning, ICLR, 2024. Paper Code
MailEx: Email Event and Argument Extraction, EMNLP, 2023. Paper Dataset/Code
Gentopia: A Collaborative Platform for Tool-Augmented LLMs, EMNLP Demo, 2023. Paper Code

A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models, Preprint, 2025. Paper
Mechanistic Understanding of Language Models in Syntactic Code Completion, AAAI KnowFM Workshop, 2025. Paper
Understanding the Effect of Algorithm Transparency of Model Explanations in Text-to-SQL Semantic Parsing, Preprint, 2025. Paper
An Investigation of Neuron Activation as a Unified Lens to Explain Chain-of-Thought Eliciting Arithmetic Reasoning of LLMs, ACL, 2024. Paper Code
Explaining Large Language Model-Based Neural Semantic Parsers, AAAI, 2023. Student Abstract

IntelliExplain: Enhancing Conversational Code Generation for Non-Professional Programmers, Preprint, 2024. Website Paper
Learning to Simulate Natural Language Feedback for Interactive Semantic Parsing, ACL, 2023. Paper Code

MathVC: An LLM-Simulated Multi-Character Virtual Classroom for Mathematics Education, AAAI AI4Edu Workshop, 2025. Website Paper Code
Can LLMs Simulate Personas with Reversed Performance? A Benchmark for Counterfactual Instruction Following, Preprint, 2025. Paper Code