The ICLR 2024 Workshop on LLM Agents examines agents driven by large language models (LLMs), a topic that has recently sparked intense discussion. Building on the rapid recent progress in LLMs, we focus on autonomous agents that perform intricate tasks in both real and simulated environments, guided by natural language instructions. What sets these agents apart is their sophisticated use of language prompts, not just as a means of communication but also as a medium for reasoning, a characteristic once thought unique to humans.

Previous Workshop: TamingLLM @ Sigdial & INLG 2023


We will explore a range of topics in this workshop, including, but not limited to, the following areas:

Memory Mechanisms and Linguistic Representation:

This session will analyze the similarities between LLMs and human memory and will discuss how linguistic representations are formed and stored in LLMs.

Tool Augmentation and Grounding (interaction with environment):

Addressing the enhancement of LLMs through tool augmentation, this session will also include a discourse on grounding – linking natural language concepts to particular contexts.

Reasoning, Planning, and Risks:

This session will discuss the intertwined processes of reasoning and planning in language agents and highlight the potential hazards associated with language agents' ability to autonomously operate in the real world.

Multi-modality and Integration in Language Agents:

This session will explore how language agents can integrate multiple modalities such as vision, sound, and touch to enhance their understanding of, and interaction with, the environment.

Conceptual Framework for Language Agents:

This session will delve into a potential framework for language agents by drawing from both classic and contemporary AI research and related fields such as neuroscience, cognitive science, and linguistics.


Invited Speakers

Denny Zhou

Principal Scientist/Research Director, Google DeepMind

Luke Zettlemoyer

Professor, Allen School of Computer Science & Engineering, University of Washington

Chelsea Finn

Assistant Professor, Stanford University

Karthik Narasimhan

Assistant Professor, Computer Science, Princeton University

Graham Neubig

Associate Professor, CMU LTI

Joyce Y. Chai

Professor, University of Michigan


Panelists

Tao Yu

Assistant Professor, The University of Hong Kong

Roberta Raileanu

Research Scientist, Meta GenAI

Alexandre Drouin

Staff Research Scientist, ServiceNow Research

Denny Zhou

Principal Scientist/Research Director, Google DeepMind

Graham Neubig

Associate Professor, CMU LTI

Luke Zettlemoyer

Professor, Allen School of Computer Science & Engineering, University of Washington

Workshop Schedule

Time | Session | Duration | Details
8:40AM - 8:45AM | Opening Remarks | 5 min | Welcome and Introduction to the Workshop
8:45AM - 9:15AM | Invited Talk #1 | 30 min | Denny Zhou: LLM Reasoning: Key Ideas and Limitations
9:15AM - 9:25AM | Spotlight Presentation #1 | 10 min | AutoGen
9:25AM - 9:35AM | Spotlight Presentation #2 | 10 min | Data-Copilot
9:35AM - 10:05AM | Invited Talk #2 | 30 min | Luke Zettlemoyer: ART: Automatic multi-step reasoning and tool-use for large language models
10:05AM - 10:30AM | Coffee Break | 25 min | Networking and refreshments
10:30AM - 10:40AM | Spotlight Presentation #3 | 10 min | AutoAct
10:40AM - 10:50AM | Spotlight Presentation #4 | 10 min | Large Language Models can Strategically Deceive their Users
10:50AM - 11:20AM | Invited Talk #3 | 30 min | Graham Neubig: OpenDevin: A Platform for AI Agents Supporting Software Development, Web Navigation, and Beyond
11:20AM - 12:15PM | Poster Session I | 55 min | First session for display and discussion of all accepted submissions
12:15PM - 1:15PM | Lunch Break | 60 min | Time for lunch and informal discussions
1:15PM - 2:10PM | Panel Discussion | 55 min | Interactive session with panelists: Tao Yu, Roberta Raileanu, Alexandre Drouin, Denny Zhou, Graham Neubig, and Luke Zettlemoyer
2:10PM - 2:20PM | Spotlight Presentation #5 | 10 min | CodeAct
2:20PM - 2:30PM | Spotlight Presentation #6 | 10 min | Exploring Collaboration Mechanisms for LLM Agents
2:30PM - 3:30PM | Poster Session II, with Coffee | 60 min | Second session for display and discussion of all accepted submissions; networking and refreshments
3:30PM - 4:00PM | Invited Talk #4 | 30 min | Chelsea Finn: Can AI agents learn from high-level feedback?
4:00PM - 4:30PM | Invited Talk #5 | 30 min | Karthik Narasimhan
4:30PM - 5:00PM | Invited Talk #6 | 30 min | Joyce Chai: LLMs in Connecting Humans and Embodied Agents
5:00PM - 5:05PM | Closing Remarks | 5 min | Concluding the workshop

Call for Papers

Important Dates:

  • Submission Deadline: February 11th, 2024 (11:59 pm AoE)
  • Acceptance Notification: March 10th, 2024 (originally March 3rd, 2024)
  • Camera Ready Deadline: April 20th, 2024
  • Paper Availability on Website: April 27th, 2024
  • Workshop Date: May 11th, 2024
  • Location: Vienna Exhibition & Congress Center

Submission Tracks:

Consistent with the themes of the workshop, we invite contributions in the areas highlighted above. The list of topics is not exhaustive, and we welcome submissions in related areas. There is no need to specify your track on OpenReview. We will not accept work that has already been published at other machine learning conferences, and work presented at the main ICLR conference should not be submitted to this workshop.

  • Research Paper Track: We welcome a variety of original research papers, including but not limited to those that propose new techniques, discussion-based papers, literature surveys, and position papers. Research papers may contain up to 9 pages of content, plus unlimited pages for references and appendix.
  • Demo Paper Track: We also welcome technical reports for the demo track, with a maximum of 9 pages (same as research papers). In addition to the paper, please provide a link to a video, website, or code repository showcasing your demo.

Submission Guidelines:

  • 🌐 Submission Platform:
  • 📄 Paper Requirements:
    • Use the provided LaTeX template for your submission.
    • Papers should be anonymized and uploaded as a single PDF.
    • 📚 References and Appendix: Reviewers are not obliged to read the appendix.
  • 🔍 Non-Archival Policy:
    • Submissions will not be indexed or have archival proceedings. We welcome ICML 24 or ACL 24 submissions.
    • Accepted papers will be displayed on the workshop website on April 27th, 2024.
  • 🔄 Dual Submission Policy:
    • Submissions under review at other venues will be accepted, provided they do not breach any dual-submission or anonymity policies of those venues.
  • 👀 Review Process:
    • The review process is double-blind.
  • 🏆 Best Paper Award:
    • The award for best paper will be announced at the workshop.

Accepted Papers

  • AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation,
    Qingyun Wu, Gagan Bansal, Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu, Li Jiang, Xiaoyun Zhang, Shaokun Zhang, Jiale Liu, Ahmed Hassan Awadallah, Ryen W White, Doug Burger, Chi Wang
  • Data-Copilot: Bridging Billions of Data and Humans with Autonomous Workflow,
    Wenqi Zhang, Yongliang Shen, Weiming Lu, Yueting Zhuang
  • AutoAct: Automatic Agent Learning from Scratch via Self-Planning,
    Shuofei Qiao, Ningyu Zhang, Runnan Fang, Yujie Luo, Wangchunshu Zhou, Yuchen Eleanor Jiang, Chengfei Lv, Huajun Chen
  • Large Language Models can Strategically Deceive their Users when Put Under Pressure,
    Jérémy Scheurer, Mikita Balesni, Marius Hobbhahn
  • Executable Code Actions Elicit Better LLM Agents,
    Xingyao Wang, Yangyi Chen, Lifan Yuan, Yizhe Zhang, Yunzhu Li, Hao Peng, Heng Ji
  • Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View,
    Jintian Zhang, Xin Xu, Ningyu Zhang, Ruibo Liu, Bryan Hooi, Shumin Deng

  • Towards Unified Alignment Between Agents, Humans, and Environment,
    Zonghan Yang, An Liu, Zijun Liu, Kaiming Liu, Fangzhou Xiong, Yile Wang, Zeyuan Yang, Qingyuan Hu, XinRui Chen, Zhenhe Zhang, Fuwen Luo, Zhicheng Guo, Peng Li, Yang Liu
  • Self-Training Language Models in Arithmetic Reasoning,
    Marek Kadlčík, Michal Štefánik, Ondrej Sotolar, Vlastimil Martinek
  • R2E: Turning any Github Repository into a Programming Agent Test Environment,
    Naman Jain, Manish Shetty, Tianjun Zhang, King Han, Koushik Sen, Ion Stoica
  • Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs,
    Da Yin, Faeze Brahman, Abhilasha Ravichander, Khyathi Chandu, Kai-Wei Chang, Yejin Choi, Bill Yuchen Lin
    Zhaoyi Li, Kelin Yu, Shuo Cheng, Danfei Xu
  • WavCraft: Audio Editing and Generation with Large Language Models,
    Jinhua Liang, Huan Zhang, Haohe Liu, Yin Cao, Qiuqiang Kong, Xubo Liu, Wenwu Wang, Mark D Plumbley, Huy Phan, Emmanouil Benetos
  • SAGE: Bridging Semantic and Actionable Parts for Generalizable Manipulation of Articulated Objects,
    Haoran Geng, Songlin Wei, Congyue Deng, Bokui Shen, He Wang, Leonidas Guibas
  • Simulating Opinion Dynamics with Networks of LLM-based Agents,
    Yun-Shiuan Chuang, Agam Goyal, Nikunj Harlalka, Siddharth Suresh, Robert D. Hawkins, Sijia Yang, Dhavan V. Shah, Junjie Hu, Timothy T. Rogers
  • Agents: An Open-source Framework for Autonomous Language Agents,
    Wangchunshu Zhou, Yuchen Eleanor Jiang, Long Li, Jialong Wu, Tiannan Wang, Shuai Wang, Jiamin Chen, Jintian Zhang, Jing Chen, Xiangru Tang, Peng Cui, Ningyu Zhang, Huajun Chen, Mrinmaya Sachan
  • A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts,
    Kuang-Huei Lee, Xinyun Chen, Hiroki Furuta, John Canny, Ian Fischer
  • The Agent Ohana: Designing Unified Data and Training Pipeline for Effective Agent Learning,
    Jianguo Zhang, Tian Lan, Rithesh R N, Zhiwei Liu, Weiran Yao, Juntao Tan, Yihao Feng, Thai Quoc Hoang, Tulika Manoj Awalgaonkar, Liangwei Yang, Shelby Heinecke, Huan Wang, Juan Carlos Niebles, Silvio Savarese, Caiming Xiong
  • Can Large Language Models be Good Path Planners? A Benchmark and Investigation on Spatial-temporal Reasoning,
    Mohamed Aghzal, Erion Plaku, Ziyu Yao
  • FinMem: A Performance-Enhanced LLM Trading Agent with Layered Memory and Character Design,
    Haohang Li, Yangyang Yu, Zhi Chen, Yuechen Jiang, Yang Li, Denghui Zhang, Rong Liu, Jordan W. Suchow, Khaldoun Khashanah
  • ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL,
    Yifei Zhou, Andrea Zanette, Jiayi Pan, Aviral Kumar, Sergey Levine
  • Beyond A*: Better LLM planning via Search Dynamics Bootstrapping,
    Lucas Lehnert, Sainbayar Sukhbaatar, Paul McVay, Michael Rabbat, Yuandong Tian
  • A-CONECT: Designing AI-based Conversational Chatbot for Early Dementia Intervention,
    Junyuan Hong, Wenqing Zheng, Han Meng, Siqi Liang, Anqing Chen, Hiroko H. Dodge, Jiayu Zhou, Zhangyang Wang
  • Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast,
    Xiangming Gu, Xiaosen Zheng, Tianyu Pang, Chao Du, Qian Liu, Ye Wang, Jing Jiang, Min Lin
  • Large Language Model Evaluation Via Multi AI Agents: Preliminary results,
    Zeeshan Rasheed, Muhammad Waseem, Kari Systä, Pekka Abrahamsson
  • Towards General Computer Control: A Multimodal Agent for Red Dead Redemption II as a Case Study,
    Weihao Tan, Ziluo Ding, Wentao Zhang, Boyu Li, Bohan Zhou, Junpeng Yue, Haochong Xia, Jiechuan Jiang, Longtao Zheng, Xinrun Xu, Yifei Bi, Pengjie Gu, Xinrun Wang, Börje F. Karlsson, Bo An, Zongqing Lu
  • GPT-4V(ision) is a Generalist Web Agent, if Grounded,
    Boyuan Zheng, Boyu Gou, Jihyung Kil, Huan Sun, Yu Su
  • OpenAgents: An Open Platform for Language Agents in the Wild,
    Tianbao Xie, Fan Zhou, Zhoujun Cheng, Peng Shi, Luoxuan Weng, Yitao Liu, Toh Jing Hua, Junning Zhao, Qian Liu, Che Liu, Zeyu Liu, Yiheng Xu, Hongjin SU, Dongchan Shin, Caiming Xiong, Tao Yu
  • OpenFMNav: Towards Open-Set Zero-Shot Object Navigation via Vision-Language Foundation Models,
    Yuxuan Kuang, Hai Lin, Meng Jiang
  • TravelPlanner: A Benchmark for Real-World Planning with Language Agents,
    Jian Xie, Kai Zhang, Jiangjie Chen, Tinghui Zhu, Renze Lou, Yuandong Tian, Yanghua Xiao, Yu Su
  • Empowering Autonomous Driving with Large Language Models: A Safety Perspective,
    Yixuan Wang, Ruochen Jiao, Simon Sinong Zhan, Chengtian Lang, Chao Huang, Zhaoran Wang, Zhuoran Yang, Qi Zhu
  • REX: Rapid Exploration and eXploitation for AI agents,
    Rithesh R N, Shelby Heinecke, Juan Carlos Niebles, Zhiwei Liu, Le Xue, Weiran Yao, Yihao Feng, Zeyuan Chen, Akash Gokul, Devansh Arpit, Ran Xu, Phil L Mui, Huan Wang, Caiming Xiong, Silvio Savarese
  • Towards Natural Language-Driven Industrial Assembly Using Foundation Models,
    Omkar Joglekar, Shir Kozlovsky, Tal Lancewicki, Vladimir Tchuiev, Zohar Feldman, Dotan Di Castro
  • Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception,
    Junyang Wang, Haiyang Xu, Jiabo Ye, Ming Yan, Weizhou Shen, Ji Zhang, Fei Huang, Jitao Sang
  • Exposing Limitations of Language Model Agents in Sequential-Task Compositions on the Web,
    Hiroki Furuta, Yutaka Matsuo, Aleksandra Faust, Izzeddin Gur
  • LLM Reasoners: New Evaluation, Library, and Analysis of Step-by-Step Reasoning with Large Language Models,
    Shibo Hao, Yi Gu, Haotian Luo, Tianyang Liu, Xiyan Shao, Xinyuan Wang, Shuhua Xie, Haodi Ma, Adithya Samavedhi, Qiyue Gao, Zhen Wang, Zhiting Hu
  • R-Judge: Benchmarking Safety Risk Awareness for LLM Agents,
    Tongxin Yuan, Zhiwei He, Lingzhong Dong, Yiming Wang, Ruijie Zhao, Tian Xia, Lizhen Xu, Binglin Zhou, Li Fangqi, Zhuosheng Zhang, Rui Wang, Gongshen Liu
  • LLF-Bench: Benchmark for Interactive Learning from Language Feedback,
    Ching-An Cheng, Andrey Kolobov, Dipendra Misra, Allen Nie, Adith Swaminathan
  • LLM-Deliberation: Evaluating LLMs with Interactive Multi-Agent Negotiation Game,
    Sahar Abdelnabi, Amr Gomaa, Sarath Sivaprasad, Lea Schönherr, Mario Fritz
  • Is it Possible to Edit Large Language Models Robustly?,
    Xinbei Ma, Tianjie Ju, Jiyang Qiu, Zhuosheng Zhang, Hai Zhao, Lifeng Liu, Yulong Wang
  • Agent Instructs Large Language Models to be General Zero-Shot Reasoners,
    Nicholas Crispino, Kyle Montgomery, Fankun Zeng, Dawn Song, Chenguang Wang
  • WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks?,
    Alexandre Drouin, Maxime Gasse, Massimo Caccia, Issam H. Laradji, Manuel Del Verme, Tom Marty, David Vazquez, Nicolas Chapados, Alexandre Lacoste
  • Corex: Pushing the Boundaries of Complex Reasoning through Multi-Model Collaboration,
    Qiushi Sun, Zhangyue Yin, Xiang Li, Zhiyong Wu, Xipeng Qiu, Lingpeng Kong
  • ProtAgents: Protein discovery via large language model multi-agent collaborations combining physics and machine learning,
    Alireza Ghafarollahi, Markus Buehler
  • Hierarchical Auto-Organizing System for Open-Ended Multi-Agent Navigation,
    Zhonghan Zhao, Kewei Chen, Dongxu Guo, Wenhao Chai, Tian Ye, Yanting Zhang, Gaoang Wang
  • EHRAgent: Code Empowers Large Language Models for Few-shot Complex Tabular Reasoning on Electronic Health Records,
    Wenqi Shi, Ran Xu, Yuchen Zhuang, Yue Yu, Jieyu Zhang, Hang Wu, Yuanda Zhu, Joyce C. Ho, Carl Yang, May Dongmei Wang
  • Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models,
    Zhiyuan Hu, Chumin Liu, Xidong Feng, Yilun Zhao, See-Kiong Ng, Anh Tuan Luu, Junxian He, Pang Wei Koh, Bryan Hooi
  • TaskBench: Benchmarking Large Language Models for Task Automation,
    Yongliang Shen, Kaitao Song, Xu Tan, Wenqi Zhang, Kan Ren, Siyu Yuan, Weiming Lu, Dongsheng Li, Yueting Zhuang
  • SELF-IMAGINE: Effective Unimodal Reasoning with Multimodal Models using Self-Imagination,
    Syeda Nahida Akter, Aman Madaan, Sangwu Lee, Yiming Yang, Eric Nyberg
  • BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments,
    Yusuf H Roohani, Jian Vora, Qian Huang, Percy Liang, Jure Leskovec
    Lin Xu, Zhiyuan Hu, Daquan Zhou, Hongyu Ren, Zhen Dong, Kurt Keutzer, See-Kiong Ng, Jiashi Feng
  • Do LLM Agents Have Regret? A Case Study in Online Learning and Games,
    Chanwoo Park, Xiangyu Liu, Asuman E. Ozdaglar, Kaiqing Zhang
  • Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science,
    Xiangru Tang, Qiao Jin, Kunlun Zhu, Tongxin Yuan, Yichi Zhang, Wangchunshu Zhou, Meng Qu, Yilun Zhao, Jian Tang, Zhuosheng Zhang, Arman Cohan, Zhiyong Lu, Mark Gerstein
  • Expressing and Exploiting Parallelism in Language Model Decoding,
    Tian Jin, Ellie Y Cheng, Michael Carbin
  • Towards Self-Improving Language Models for Code Generation,
    Michaël Defferrard, Corrado Rainone, David W. Zhang, Blazej Manczak, Natasha Butt, Taco Cohen
  • MathChat: Converse to Tackle Challenging Math Problems with LLM Agents,
    Yiran Wu, Feiran Jia, Shaokun Zhang, Hangyu Li, Erkang Zhu, Yue Wang, Yin Tat Lee, Richard Peng, Qingyun Wu, Chi Wang
  • L3GO: Language Agents with Chain-of-3D-Thoughts for Generating Unconventional Objects,
    Yutaro Yamada, Khyathi Chandu, Bill Yuchen Lin, Jack Hessel, Ilker Yildirim, Yejin Choi
  • An Embodied Generalist Agent in 3D World,
    Jiangyong Huang, Silong Yong, Xiaojian Ma, Xiongkun Linghu, Puhao Li, Yan Wang, Qing Li, Song-Chun Zhu, Baoxiong Jia, Siyuan Huang
  • Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization,
    Wenqi Zhang, Ke Tang, Hai Wu, Mengna Wang, Yongliang Shen, Guiyang Hou, Zeqi Tan, Peng Li, Yueting Zhuang, Weiming Lu
  • Recursive Speculative Decoding: Accelerating LLM Inference via Sampling Without Replacement,
    Wonseok Jeon, Mukul Gagrani, Raghavv Goel, Junyoung Park, Mingu Lee, Christopher Lott
  • VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks,
    Jing Yu Koh, Robert Lo, Lawrence Jang, Vikram Duvvur, Ming Chong Lim, Po-Yu Huang, Graham Neubig, Shuyan Zhou, Ruslan Salakhutdinov, Daniel Fried
  • HELPER-X: A Unified Instructable Embodied Agent to Tackle Four Interactive Vision-Language Domains with Memory-Augmented Language Models,
    Gabriel Herbert Sarch, Sahil Somani, Raghav Kapoor, Michael J. Tarr, Katerina Fragkiadaki
  • Controlling Large Language Model-based Agents for Large-Scale Decision-Making: An Actor-Critic Approach,
    Bin Zhang, Hangyu Mao, Jingqing Ruan, Ying Wen, Yang Li, Shao Zhang, Zhiwei Xu, Dapeng Li, Ziyue Li, Rui Zhao, Lijuan Li, Guoliang Fan
  • Plan-Seq-Learn: Language Model Guided RL for Solving Long Horizon Robotics Tasks,
    Murtaza Dalal, Tarun Chiruvolu, Devendra Singh Chaplot, Ruslan Salakhutdinov
  • Adapting Uni-Modal Language Models for Dense Multi-Modal Co-Reference Resolution using Parameter Augmentation,
    Samuel Osebe, Prashan Wanigasekara, Thanh Tran, Thomas Gueudre
  • Preference-Conditioned Language-Guided Abstraction,
    Andi Peng, Andreea Bobu, Belinda Z. Li, Theodore Sumers, Ilia Sucholutsky, Nishanth Kumar, Thomas L. Griffiths, Julie Shah
  • S-Agent: self-organizing agents in open-ended environment,
    Jiaqi Chen, Yuxian Jiang, Jiachen Lu, Li Zhang
  • Efficient Human-AI Coordination via Preparatory Language-based Convention,
    Cong Guan, Lichao Zhang, Chunpeng Fan, Yi-Chen Li, Feng Chen, Lihe Li, Yunjia Tian, Lei Yuan, Yang Yu
  • SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents,
    Kanzhi Cheng, Qiushi Sun, Yougang Chu, Fangzhi Xu, Li YanTao, Jianbing Zhang, Zhiyong Wu
  • The ART of LLM Refinement: Ask, Refine, Trust,
    Kumar Shridhar
  • SceneCraft: An LLM Agent for Synthesizing 3D Scene as Blender Code,
    Ziniu Hu
  • LangProp: A code optimization framework using Large Language Models applied to driving,
    Shu Ishida, Gianluca Corrado, George Fedoseev, Hudson Yeo, Lloyd Russell, Jamie Shotton, Joao F. Henriques, Anthony Hu
  • FL-TAC: Enhanced Fine-Tuning in Federated Learning via Low-Rank, Task-Specific Adapter Clustering,
    Siqi Ping, Yuzhu Mao, Yang Liu, Xiao-Ping Zhang, Wenbo Ding
  • EcoAssistant: Using LLM Assistants More Affordably and Accurately,
    Jieyu Zhang, Ranjay Krishna, Ahmed Hassan Awadallah, Chi Wang
  • IntentGPT: Few-shot Intent Discovery with Large Language Models,
    Juan A. Rodriguez, Nicholas Botzer, David Vazquez, Christopher Pal, Marco Pedersoli, Issam H. Laradji
  • Language-guided Skill Learning with Temporal Variational Inference,
    Haotian Fu, Pratyusha Sharma, Elias Stengel-Eskin, George Konidaris, Nicolas Le Roux, Marc-Alexandre Côté, Xingdi Yuan
  • Decision-Oriented Dialogue for Human-AI Collaboration,
    Jessy Lin, Nicholas Tomlin, Jacob Andreas, Jason Eisner
  • Making Retrieval-Augmented Language Models Robust to Irrelevant Context,
    Ori Yoran, Tomer Wolfson, Ori Ram, Jonathan Berant
  • MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning,
    Xiangru Tang, Anni Zou, Zhuosheng Zhang, Ziming Li, Yilun Zhao, Xingyao Zhang, Arman Cohan, Mark Gerstein
  • Collaborative LLM-Agents for Editable Driving Scene Simulation,
    Yuxi Wei, Zi Wang, Yifan Lu, Chenxin Xu, Changxing Liu, Hao Zhao, Siheng Chen, Yanfeng Wang
  • WebLINX: Real-World Website Navigation with Multi-Turn Dialogue,
    Xing Han Lu, Zdeněk Kasner, Siva Reddy
  • The Wisdom of Partisan Crowds: Comparing Collective Intelligence in Humans and LLM-based Agents,
    Yun-Shiuan Chuang, Nikunj Harlalka, Siddharth Suresh, Agam Goyal, Robert D. Hawkins, Sijia Yang, Dhavan V. Shah, Junjie Hu, Timothy T. Rogers
    Zhiwei Liu, Weiran Yao, Jianguo Zhang, Le Xue, Shelby Heinecke, Rithesh R N, Yihao Feng, Zeyuan Chen, Juan Carlos Niebles, Devansh Arpit, Ran Xu, Phil L Mui, Huan Wang, Caiming Xiong, Silvio Savarese
  • Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Systems,
    Yilun Kong, Jingqing Ruan, YiHong Chen, Bin Zhang, Tianpeng Bao, shi shiwei, du guo qing, xiaoru hu, Hangyu Mao, Ziyue Li, Xingyu Zeng, Rui Zhao, Xueqian Wang
  • Self-Alignment of Large Language Models via Multi-Agent Social Simulation,
    Xianghe Pang, Shuo Tang, Rui Ye, Yuxin Xiong, Bolun Zhang, Yanfeng Wang, Siheng Chen
  • If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents,
    Ke Yang, Jiateng Liu, John Wu, Chaoqi Yang, Yi Fung, Sha Li, Zixuan Huang, Xu Cao, Xingyao Wang, Heng Ji, ChengXiang Zhai
  • ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent,
    Renat Aksitov, Sobhan Miryoosefi, Zonglin Li, Daliang Li, Sheila Babayan, Kavya Kopparapu, Zachary Fisher, Ruiqi Guo, Sushant Prakash, Pranesh Srinivasan, Manzil Zaheer, Felix Yu, Sanjiv Kumar
  • Are Machines Better at Slow Thinking? Unveiling Human-Machine Inference Gaps in Entailment Verification,
    Soumya Sanyal, Tianyi Xiao, Jiacheng Liu, Wenya Wang, Xiang Ren
  • Limitations of Agents Simulated by Predictive Models,
    Raymond Douglas, Jacek Karwowski, Chan Bae, Andis Draguns, Victoria Krakovna
  • OS-Copilot: Towards Generalist Computer Agents with Self-Improvement,
    Zhiyong Wu, Chengcheng Han, Zichen Ding, Zhenmin Weng, Zhoumianze Liu, Shunyu Yao, Tao Yu, Lingpeng Kong
  • EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction,
    Siyu Yuan, Kaitao Song, Jiangjie Chen, Xu Tan, Yongliang Shen, Kan Ren, Dongsheng Li, Deqing Yang
  • FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets,
    Seonghyeon Ye, Doyoung Kim, Sungdong Kim, Hyeonbin Hwang, Seungone Kim, Yongrae Jo, James Thorne, Juho Kim, Minjoon Seo
  • Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models,
    Andy Zhou, Kai Yan, Michal Shlapentokh-Rothman, Haohan Wang, Yu-Xiong Wang
  • On the Road with GPT-4V(ision): Explorations of Utilizing Visual-Language Model as Autonomous Driving Agent,
    Licheng Wen, Xuemeng Yang, Daocheng Fu, Xiaofeng Wang, Pinlong Cai, Xin Li, Tao MA, Yingxuan Li, Linran XU, Dengke Shang, Zheng Zhu, Shaoyan Sun, Yeqi BAI, Xinyu Cai, Min Dou, Shuanglu Hu, Botian Shi, Yu Qiao
  • Bring Your Own KG: Self-Supervised Program Synthesis for Zero-Shot KGQA,
    Dhruv Agarwal, Rajarshi Das, Sopan Khosla, Rashmi Gangadharaiah
  • Open-TI: Open Traffic Intelligence with Augmented Language Model,
    Longchao Da, Kuan-Ru Liou, Tiejin Chen, Xuesong Zhou, Xiangyong Luo, Yezhou Yang, Hua Wei
  • AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents,
    Chang Ma, Junlei Zhang, Zhihao Zhu, Cheng Yang, Yujiu Yang, Yaohui Jin, Zhenzhong Lan, Lingpeng Kong, Junxian He


Workshop Organizers

Organizing Committee

Xinyun Chen

Senior Research Scientist, Google DeepMind

Xiangru Robert Tang

Ph.D. student @ Yale University

Di Jin

Senior Applied Scientist @ Amazon

Devamanyu Hazarika

Applied Scientist @ Amazon

Daniel Fried

Assistant Professor, CMU LTI

Dawn Song

Professor, University of California, Berkeley

Shafiq Joty

Research Director, Salesforce Research

Meredith Ringel Morris

Director for Human-AI Interaction Research, Google DeepMind

Program Committee

  • Yuelyu Ji, University of Pittsburgh
  • Hangyu Mao, Sensetime Research
  • Boyuan Zheng, Ohio State University, Columbus
  • Siyu Yuan, Fudan University
  • Xin Cong, Tsinghua University
  • Markus Buehler, Massachusetts Institute of Technology
  • Lin Xu, National University of Singapore
  • Chenfei Yuan, Department of Computer Science and Technology, Tsinghua University
  • Haochen Vector Zhao, Peking University
  • Feiran Jia, Pennsylvania State University
  • Yao Yao, Shanghai Jiao Tong University
  • Zhang Ruichen, Nanyang Technological University
  • Mathieu Ravaut, Nanyang Technological University
  • Zirui Zhao, National University of Singapore
  • Jialong Wu, Southeast University
  • Rithesh R N, SalesForce.com
  • Juntao Tan, Rutgers University
  • Ting Chen, University of Electronic Science and Technology of China
  • Yun-Shiuan Chuang, University of Wisconsin - Madison
  • Jiageng Mao, University of Southern California
  • Yongliang Shen, Microsoft
  • Zhiruo Wang, Carnegie Mellon University
  • Jiuzhou Han, Monash University
  • Kaixin Ma, Tencent AI Lab
  • Hao Peng, Department of Computer Science, University of Illinois Urbana-Champaign
  • Jian Guan, Tsinghua University
  • Shaoguang Mao, Microsoft
  • Olivia Watkins, University of California Berkeley
  • Jiateng Liu, Department of Computer Science
  • Qian Huang, Google
  • Haozhe Zhao, Peking University
  • Yecheng Jason Ma, University of Pennsylvania
  • Zhenran Xu, Harbin Institute of Technology, Shenzhen
  • Zhongshen Zeng, Department of Computer Science and Engineering, The Chinese University of Hong Kong
  • Kuang-Huei Lee, Google
  • Chunyuan Deng, Georgia Institute of Technology
  • Meghana Moorthy Bhat, Salesforce Research
  • Tianjun Zhang, University of California Berkeley
  • Jiangyong Huang, Peking University
  • Wenshan Wu, Microsoft
  • Kimin Lee, Korea Advanced Institute of Science & Technology
  • Daquan Zhou, Bytedance
  • Haoqi Yuan, Peking University
  • Osbert Bastani, University of Pennsylvania
  • Shuyan Zhou, Carnegie Mellon University
  • Agam Goyal, University of Wisconsin - Madison
  • Gang Qiao, Siemens Healthineers
  • Xun Wang, Microsoft
  • Sahitya Potluri, Google
  • Xingyao Wang, Department of Computer Science, University of Illinois Urbana-Champaign
  • Wenyue Hua, Rutgers University, New Brunswick
  • Younggyo Seo, Dyson
  • Zhangcheng Qiang, Australian National University
  • Boyu Gou, Ohio State University, Columbus
  • Jian Xie, Fudan University
  • Ziniu Hu, California Institute of Technology
  • Yichi Zhang, Peking University
  • Fangkai Jiao, Nanyang Technological University
  • Yangyi Chen, School of Computer Science, University of Illinois at Urbana-Champaign
  • Ravi Pandya, Carnegie Mellon University
  • Zelong Li, Rutgers University, New Brunswick
  • Jiayuan Mao, Massachusetts Institute of Technology
  • Bohan Lyu, Tsinghua University
  • Senbao Shi, Harbin Institute of Technology
  • Kaitao Song, Microsoft
  • Nikunj Harlalka, University of Wisconsin - Madison
  • Zhihan Liu, Northwestern University
  • Haochun Wang, Harbin Institute of Technology
  • Chi Zhang, Tencent
  • Chang Gao, The Chinese University of Hong Kong
  • Kun Shao, Huawei Noah's Ark Lab
  • Lanqing Li, Zhejiang Lab
  • Ziyuan Qin, Case Western Reserve University
  • Chengjie Zheng, University of Massachusetts Boston
  • Bharat Prakash, University of Maryland, Baltimore County
  • Yanjun Shao, Fudan University
  • Amrita Saha, SalesForce.com
  • Ke Yang, Department of Computer Science
  • Zhao Xu, Hong Kong University of Science and Technology
  • Ruochen Zhao, Nanyang Technological University
  • Chaoqi Yang, University of Illinois Urbana Champaign
  • Hao Wang, Google
  • Yangyang Yu, Stevens Institute of Technology
  • Shuofei Qiao, Zhejiang University
  • Hailin Chen, National Technological University
  • Yuan Yao, Nanjing University
  • Lei Liu, The Chinese University of Hong Kong, Shenzhen
  • Yuechen Jiang, Stevens Institute of Technology
  • Pengguang Chen, SmartMore
  • Chuan Xiao, Osaka University
  • Sha Li, University of Illinois, Urbana Champaign
  • Wenqi Zhang, Zhejiang University
  • Yilun Zhao, Yale University
  • Kaikai An, Peking University
  • Yunhao Yang, University of Texas at Austin
  • Haohang Li, Stevens Institute of Technology
  • Jianghao Zhang, University of Michigan - Ann Arbor
  • Shruti Singh, IIT Gandhinagar
  • Zhi Chen, Stevens Institute of Technology
  • He Zhu, Rutgers University
  • Allen Nie, Stanford University
  • Shuzheng Si, Peking University
  • Muhammad Waseem, University of Jyväskylä
  • Jing Yu Koh, Carnegie Mellon University
  • Kunlun Zhu, Université de Montréal
  • Chengwei Qin, Nanyang Technological University
  • Zengqing Wu, Kyoto University
  • Vernon Bumgardner, University of Kentucky
  • Chenyang Zhao, Zhejiang Lab
  • Rong Liu, Stevens Institute of Technology
  • Sihao Hu, Georgia Institute of Technology
  • Srijan Bansal, Carnegie Mellon University
  • Da Yin, University of California, Los Angeles
  • Hung Le, Salesforce Research
  • Enxhell Luzhnica, Google
  • Michelle D Zhao, Carnegie Mellon University
  • Yunfan Jiang, Stanford University
  • Hongyang Du, Nanyang Technological University
  • Jason Phang, New York University
  • Xingxuan Li, Nanyang Technological University
  • Mingqi Gao, Peking University
  • Xiao Han, Peking University
  • Haojie Pan, Department of Computer Science and Engineering, Hong Kong University of Science and Technology
  • Pekka Abrahamsson, Tampere University
  • Haibin Huang, Kuaishou Technology
  • Yiming Zhang, Tokyo Institute of Technology
  • Baotian Hu, Harbin Institute of Technology, Shenzhen
  • Yang Yuan, Tsinghua University
  • Yixin Zhang, Kyoto University
  • Riccardo Cantini, University of Calabria
  • Tiankai Hang, Southeast University
  • Gongshen Liu, Shanghai Jiao Tong University
  • Yuzhou Du, Northwestern University
  • Xiaocheng Lu, Hong Kong Polytechnic University
  • Sarang Gupta, Asana
  • Inderjeet Jayakumar Nair, University of Michigan - Ann Arbor
  • Gabrielle Kaili-May Liu, Department of Computer Science, Yale University
  • Shuyuan Zheng, Osaka University
  • Run Peng, University of Michigan - Ann Arbor
  • Mira Moukheiber, Massachusetts Institute of Technology
  • John Wu, University of Illinois at Urbana-Champaign
  • Bin Liu, Zhejiang Lab


The Best Paper Award:

AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation

Qingyun Wu, Gagan Bansal, Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu, Li Jiang, Xiaoyun Zhang, Shaokun Zhang, Jiale Liu, Ahmed Hassan Awadallah, Ryen W White, Doug Burger, Chi Wang