Manish Shetty
I am a Researcher at METR measuring capabilities of frontier AI.
My PhD at UC Berkeley involved building evals and environments to elicit and measure AI capabilities on software engineering tasks. My work spanned tasks across the software lifecycle: code completion, optimization, translation, and deployment.
From 2020 to 2022, I was a research fellow at Microsoft Research.
Email · CV · Scholar · GitHub · Notes · 𝕏
Papers
ICLR 2026
Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces
Mike A. Merrill, Alexander G. Shaw, ..., Manish Shetty, ..., Ludwig Schmidt
paper / website
NeurIPS 2025
GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents
Manish Shetty, Naman Jain, Jinjian Liu, Vijay Kethanaboyina, Koushik Sen, Ion Stoica
paper / website / code / blog / included in EpochAI's ECI
COLM 2025
R2E-Gym: Procedural Environments and Hybrid Verifiers for Scaling Open-Weights SWE Agents
Naman Jain*, Jaskirat Singh*, Manish Shetty, Liang Zheng, Koushik Sen, Ion Stoica
paper / website
ICML 2025
Challenges and Paths Towards AI for Software Engineering
Alex Gu, Naman Jain*, Wen-Ding Li*, Manish Shetty*, Yijia Shao, Ziyang Li, Diyi Yang, Kevin Ellis, Koushik Sen, Armando Solar-Lezama
paper
LLM4Code 2025
Syzygy: Dual Code-Test C to Rust Translation using LLMs and Dynamic Analysis
Manish Shetty*, Naman Jain*, Adwait Godbole*, Sanjit Seshia, Koushik Sen
paper / website
ICML 2024
R2E: Turning any GitHub Repository into a Programming Agent Environment
Manish Shetty*, Naman Jain*, Tianjun Zhang, King Han, Koushik Sen, Ion Stoica
paper / website / code
MLSys 2025
AIOpsLab: A Holistic Framework to Evaluate AI Agents for Enabling Autonomous Clouds
Yinfang Chen, Manish Shetty, Gagan Somashekar, Minghua Ma, Yogesh Simmhan, Jonathan Mace, Chetan Bansal, Rujia Wang, Saravan Rajmohan
paper / website / code
SoCC 2024
Building AI Agents for Autonomous Clouds: Challenges and Design Principles
Manish Shetty, Yinfang Chen, Gagan Somashekar, Minghua Ma, Yogesh Simmhan, Xuchao Zhang, Jonathan Mace, Dax Vandevoorde, Pedro Las-Casas, Shachee Mishra Gupta, Suman Nath, Chetan Bansal, Saravan Rajmohan
paper / featured on Microsoft Research Blog
Awards