Shuai Shao

Incoming M.Sc. student in EE at SJTU, advised by Prof. Weinan Zhang

I am currently an undergraduate researcher at Shanghai Jiao Tong University. My research focuses on RL for LLMs and agents, agent safety and alignment, multi-agent systems, and reinforcement learning theory.

  • RL for LLMs and Agents
  • Data Synthesis
  • Agent Safety & Alignment
  • RL Theory

Next step

SJTU EE M.Sc.

Joining as an incoming M.Sc. student advised by Prof. Weinan Zhang.

2026 cycle

4 accepted papers

Two papers accepted to ICML 2026 and two papers accepted to ICLR 2026.

Open source

1,500+ downloads, 450+ stars

1,500+ downloads for RiOSWorld on Hugging Face and 450+ stars for AgentDoG on GitHub.

Recent milestones

2026 | ICML

Two papers accepted to ICML 2026

"MonoScale: Scaling Multi-Agent System with Monotonic Improvement" and "Are Your Agents Upward Deceivers" were accepted to ICML 2026.

2026 | ICLR

Two papers accepted to ICLR 2026

"Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents" and "Towards Self-Evolving Benchmarks: Synthesizing Agent Trajectories via Test-Time Exploration" were accepted to ICLR 2026.

2025 | NeurIPS

Two papers accepted to NeurIPS 2025

"RiOSWorld: Benchmarking the Risk of Multimodal Computer-Use Agents" and "AgentNet: Decentralized Evolutionary Coordination for LLM-based Multi-Agent Systems" were accepted to NeurIPS 2025.

2025 | ICML

One paper accepted to ICML 2025

Extreme Value Policy Optimization for Safe Reinforcement Learning was accepted to ICML 2025.

Academic background

Shanghai Jiao Tong University

Incoming M.Sc. in Electrical Engineering

Advised by Prof. Weinan Zhang

Currently an undergraduate researcher at SJTU, preparing to continue in the SJTU EE M.Sc. program.

Research interests

From agent capability to agent reliability

Current focus

RL for LLMs and agents, data synthesis, LLM-agent safety and alignment, and reinforcement learning theory.

Research experience

Mar 2025 - Jan 2026

Shanghai AILab

Research Intern, co-advised by Dr. Dongrui Liu and Dr. Jing Shao

  • Co-first author of RiOSWorld, the first risk assessment benchmark for multimodal computer-use agents, accepted to NeurIPS 2025.
  • Contributed to SafeWork-R1, working on efficient-reasoning RL post-training and joint safety-efficiency evaluation across large reasoning models.
  • First author of the Misevolution project, a systematic study of risks in self-evolving LLM agents, accepted to ICLR 2026.
  • Served as one of the student leads for AgentDoG, working on large-scale data synthesis, benchmark construction, and risk trajectory generation.

Sept 2025 - Present

Shanghai Innovation Institute

Visiting Student, advised by Prof. Weinan Zhang

  • Leading MonoScale, an expansion-aware update framework for scaling LLM-based multi-agent systems under sequential agent augmentation.
  • Proposed agent-conditioned task customization and natural-language memory updates to address router cold-start when adding new agents.
  • Formalized sequential augmentation as a contextual bandit and derived a monotonically non-decreasing performance guarantee via trust-region memory optimization.

Oct 2024 - Sept 2025

APEX Lab, SJTU

Research Intern, advised by Prof. Weinan Zhang

  • Contributed to AgentNet, a decentralized coordination framework for LLM-based multi-agent systems, accepted to NeurIPS 2025.
  • Designed a retrieval-based memory system for skill refinement and agent specialization across coding, math, and QA tasks.
  • Implemented self-evolving routing so agents can adapt connections and routing behavior without a central coordinator.

Mar 2024 - Oct 2024

IIOT Lab, SJTU

Research Intern, advised by Prof. Jiaxin Ding

  • Co-second author on Extreme Value Policy Optimization for Safe Reinforcement Learning, accepted to ICML 2025.
  • Developed an extreme value quantile optimization objective for rare, high-impact constraint violations.
  • Designed extreme-priority replay and proved upper bounds on constraint violations during policy updates.

Selected Publications

ICLR 2026

Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents

Shuai Shao*, Qihan Ren*†, Chen Qian, Boyi Wei, Dadi Guo, Jingyi Yang, Xinhao Song, Linfeng Zhang, Weinan Zhang, Dongrui Liu, Jing Shao

ICML 2026

MonoScale: Scaling Multi-Agent System with Monotonic Improvement

Shuai Shao*†, Yixiang Liu*, Bingwei Lu, Weinan Zhang

NeurIPS 2025 Poster

RiOSWorld: Benchmarking the Risk of Multimodal Computer-Use Agents

Jingyi Yang*, Shuai Shao*, Dongrui Liu, Jing Shao

Technical Report

SafeWork-R1: Coevolving Safety and Intelligence under the AI-45 Law

Shanghai AILab

Technical Report

AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security

Shanghai AILab

* indicates equal contribution. † indicates project lead.

Selected open-source projects

Agent Safety

AgentDoG

A diagnostic guardrail framework for AI agent safety and security, built around trajectory-level risk assessment, fine-grained diagnosis, and the ATBench dataset.

Research Workflow

iDeer

A multi-source information aggregation and scheduled briefing tool that tracks research updates across platforms, ranks them with LLMs, and delivers personalized summaries by email.

Community, talks, and background

Service

Reviewer and workshop service

  • ICML 2026 reviewer
  • NeurIPS 2026 reviewer
  • NeurIPS Responsible FM Workshop reviewer
  • AAAI LaMAS Workshop reviewer

Talks

Invited talk

  • Invited talk at NICE Community

Languages

Working across Chinese, English, and French

  • Chinese: Native
  • English: GRE Verbal 155, Quant 170, AW 3.5
  • French: DELF B2

Beyond research

Arts and athletics

  • Vice principal violist in the SJTU high-level art troupe
  • Starting goalkeeper for the academy soccer team at SJTU

I am happy to talk about agent safety, multi-agent systems, and RL for LLMs.

The quickest way to reach me is by email. You can also find papers and updates on Google Scholar and GitHub.