@Ziyu-Yao-NLP-Lab

Ziyu Yao NLP Lab

This is the GitHub account of the NLP Lab led by Prof. Ziyu Yao at George Mason University, Department of Computer Science.

Group Webpage: https://ziyuyao.org/group/

Project Highlights 🌟

Repository Inventory 🔥

Topic 1: Reasoning and Planning

  • Revisiting Prompt Optimization with Large Reasoning Models-A Case Study on Event Extraction, Preprint 2025 Paper
  • Evaluating Vision-Language Models as Evaluators in Path Planning, IEEE/CVF CVPR, 2025 Paper Code Dataset
  • A Survey on Large Language Models for Automated Planning, Preprint 2025 Paper
  • Instruction-Tuning LLMs for Event Extraction with Annotation Guidelines, Findings of ACL 2025. Paper Code
  • Efficient but Vulnerable: Benchmarking and Defending LLM Batch Prompting Attack, ACL Findings, 2025. Paper Code
  • DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search, ICLR, 2025. Paper Code
  • Can Large Language Models be Good Path Planners? A Benchmark and Investigation on Spatial-temporal Reasoning, ICLR Workshop on LLM Agents, 2024 Paper Code
  • Look Further Ahead: Testing the Limits of GPT-4 in Path Planning, IEEE 20th International Conference on Automation Science and Engineering, 2024 Paper Code
  • Instances Need More Care: Rewriting Prompts for Instances with LLMs in the Loop Yields Better Zero-Shot Performance, ACL Findings 2024 Paper
  • Large Language Model Cascades with Mixture of Thoughts Representations for Cost-efficient Reasoning, ICLR, 2024. Paper Code
  • MailEx: Email event and argument extraction, EMNLP 2023. Paper Dataset/Code
  • Gentopia: A Collaborative Platform for Tool-Augmented LLMs, EMNLP Demo, 2023. Paper Code

Topic 2: LLM Interpretability

  • A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models, Preprint 2025. Paper
  • Mechanistic Understanding of Language Models in Syntactic Code Completion, AAAI KnowFM Workshop 2025. Paper
  • Understanding the Effect of Algorithm Transparency of Model Explanations in Text-to-SQL Semantic Parsing, Preprint 2025. Paper
  • An Investigation of Neuron Activation as a Unified Lens to Explain Chain-of-Thought Eliciting Arithmetic Reasoning of LLMs, ACL 2024. Paper Code
  • Explaining Large Language Model-Based Neural Semantic Parsers, AAAI 2023. Student Abstract

Topic 3: Human-AI Interaction

  • IntelliExplain: Enhancing Conversational Code Generation for Non-Professional Programmers, Preprint, 2024. Website Paper
  • Learning to Simulate Natural Language Feedback for Interactive Semantic Parsing, ACL 2023. Paper Code

Topic 4: LLM for X (Interdisciplinary Applications)

  • MathVC: An LLM-Simulated Multi-Character Virtual Classroom for Mathematics Education, AAAI AI4Edu Workshop, 2025. Website Paper Code
  • Can LLMs Simulate Personas with Reversed Performance? A Benchmark for Counterfactual Instruction Following, Preprint, 2025. Paper Code

Popular repositories

  1. PyCode-TextEE Public

    Python 4 1

  2. MathVC-NSF.github.io Public

    Forked from MurongYue/MathVC.github.io

    Website of the MathVC NSF Project

    JavaScript

  3. .github Public

    1

  4. ICML25-MI-Tutorial.github.io Public

    JavaScript

  5. Counterfactual-Persona-Simulation Public

    This repo contains code and experiment setup for our paper "Can LLMs Simulate Personas with Reversed Performance? A Systematic Investigation for Counterfactual Instruction Following in Math Reasoni…

    Python

  6. failure-by-interference Public

    This repo contains code and experiment setup for our paper "Failure by Interference: Language Models Make Balanced Parentheses Errors When Faulty Mechanisms Overshadow Sound Ones".

Repositories

Showing 6 of 6 repositories
