Shuyan Zhou


Hi, I’m Shuyan, a final-year PhD student at CMU LTI. I am fortunate to be advised by Graham Neubig. In Fall 2025, I will join Duke Computer Science as an Assistant Professor, and I will be taking students in the coming cycle (deadline 12/14/2024).

I work on building autonomous agents that can understand high-level language commands. My goal is to create AI agents that free people from tedious tasks and help them make better decisions.

I am best reached by email at shuyanzh@cs.cmu.edu.


Publications

* indicates equal contribution, ^ indicates mentorship

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
Tianbao Xie, Danyang Zhang, Jixuan Chen, Xiaochuan Li, Siheng Zhao, Ruisheng Cao, Toh Jing Hua, Zhoujun Cheng, Dongchan Shin, Fangyu Lei, Yitao Liu, Yiheng Xu, Shuyan Zhou, Silvio Savarese, Caiming Xiong, Victor Zhong, Tao Yu
Preprint, 2024
[Paper] [Project Site] [Twitter]

WebCanvas: Benchmarking Web Agents in Online Environments
Yichen Pan, Dehan Kong, Sida Zhou, Cheng Cui, Yifei Leng, Bing Jiang, Hangyu Liu, Yanyi Shang, Shuyan Zhou^, Tongshuang Wu, Zhengyang Wu
Preprint, 2024
[Paper] [Platform]

VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks
Jing Yu Koh, Robert Lo, Lawrence Jang, Vikram Duvvur, Ming Chong Lim, Po-Yu Huang, Graham Neubig, Shuyan Zhou, Ruslan Salakhutdinov, Daniel Fried
ACL, 2024
[Paper] [Project Site] [Twitter] [WIRED Article]

WebArena: A Realistic Web Environment for Building Autonomous Agents
Shuyan Zhou*, Frank F. Xu*, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng, Yonatan Bisk, Daniel Fried, Uri Alon, Graham Neubig
ICLR, 2024
[Paper] [Project Site] [Twitter]

DocPrompting: Generating Code by Retrieving the Docs
Shuyan Zhou, Uri Alon, Frank F. Xu, Zhiruo Wang, Zhengbao Jiang, Graham Neubig
ICLR, 2023 (spotlight)
[Paper] [Code+Data]

PaL: Program-aided Language Models
Luyu Gao*, Aman Madaan*, Shuyan Zhou*, Uri Alon, Pengfei Liu, Yiming Yang, Jamie Callan, Graham Neubig
ICML, 2023
[Paper] [Project Site] [Twitter] [Demo]

Hierarchical Prompting Assists Large Language Model on Web Navigation
Abishek Sridhar*, Robert Lo*, Frank F. Xu, Hao Zhu, Shuyan Zhou^
Findings of EMNLP, 2023
[Paper] [Code]

CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code
Shuyan Zhou*, Uri Alon*, Sumit Agarwal, Graham Neubig
EMNLP, 2023
Deep Learning for Code Workshop at ICLR, 2023 (spotlight)
[Paper] [Code]

Execution-Based Evaluation for Open-Domain Code Generation
Zhiruo Wang, Shuyan Zhou, Daniel Fried, Graham Neubig
Findings of EMNLP, 2023
[Paper] [Project Site]

Causal Reasoning of Entities and Events in Procedural Texts
Li Zhang*, Hainiu Xu*, Yue Yang, Shuyan Zhou, Weiqiu You, Manni Arora, Chris Callison-Burch
Findings of EACL, 2023
[Paper] [Code+Data]

MCoNaLa: A Benchmark for Code Generation from Multiple Natural Languages
Zhiruo Wang*, Grace Cuenca*, Shuyan Zhou^, Frank F. Xu, Graham Neubig
Findings of EACL, 2023
[Paper] [Code+Data]

Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation
Patrick Fernandes, Aman Madaan, Emmy Liu, António Farinhas, Pedro Henrique Martins, Amanda Bertsch, José GC de Souza, Shuyan Zhou, Tongshuang Wu, Graham Neubig, André FT Martins
TACL, 2023
[Paper]

Language Models of Code are Few-Shot Commonsense Learners
Aman Madaan, Shuyan Zhou, Uri Alon, Yiming Yang, Graham Neubig
EMNLP, 2022
[Paper] [Code]

Show Me More Details: Discovering Hierarchies of Procedures from Semi-structured Web Data
Shuyan Zhou*, Li Zhang*, Yue Yang, Qing Lyu, Pengcheng Yin, Chris Callison-Burch, Graham Neubig
ACL, 2022
[Paper] [Code+Data] [Demo]

Procedures as Programs: Hierarchical Control of Situated Agents through Natural Language
Shuyan Zhou, Pengcheng Yin, Graham Neubig
Structured and Unstructured Knowledge Integration Workshop at NAACL, 2022
[Paper]

Soft Gazetteers for Low-Resource Named Entity Recognition
Shruti Rijhwani, Shuyan Zhou, Graham Neubig, Jaime Carbonell
ACL, 2020
[Paper] [Code+Data]

Improving Candidate Generation for Low-resource Cross-lingual Entity Linking
Shuyan Zhou, Shruti Rijhwani, John Wieting, Jaime Carbonell, Graham Neubig
TACL, 2020
[Paper] [Code]

Towards Zero-resource Cross-lingual Entity Linking
Shuyan Zhou, Shruti Rijhwani, Graham Neubig
Deep Learning for Low-Resource NLP Workshop at EMNLP, 2019
[Paper] [Code]

Improving Robustness of Neural Machine Translation with Multi-task Learning
Shuyan Zhou, Xiangkai Zeng, Yingqi Zhou, Antonios Anastasopoulos, Graham Neubig
Conference on Machine Translation (WMT), 2019
[Paper] [Code]

Aggregated Semantic Matching for Short Text Entity Linking
Feng Nie, Shuyan Zhou, Jing Liu, Jinpeng Wang, Chin-Yew Lin, Rong Pan
CoNLL, 2018
[Paper]


Academic Service


Teaching


Experience

Master → Ph.D. of Language Technologies, Carnegie Mellon University
2018.08 - Present
Advisor: Graham Neubig

Ph.D. Resident, X, the moonshot factory
2022.05 - 2022.08
Host: Alex Polozov

Research Intern, Microsoft
2020.05 - 2020.08
Host: Kaushik Chakrabarti

Research Intern, Microsoft Research Asia
2017.07 - 2018.06
Host: Chin-Yew Lin