Profile photo of Manish Shetty
I am a researcher at METR, where I measure the capabilities of frontier AI.
My PhD at UC Berkeley involved building evals and environments to elicit and measure AI capabilities on software engineering tasks. My work spanned tasks across the software lifecycle: code completion, optimization, translation, and deployment.
From 2020 to 2022, I was a research fellow at Microsoft Research.
Email · CV · Scholar · GitHub · Notes · 𝕏

Research

Dissertation
Scaling Environments and Verifiers for Software Engineering Agents
Manish Shetty · PhD, UC Berkeley · May 2026
ICLR 2026
Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces
Mike A. Merrill, Alexander G. Shaw, ..., Manish Shetty, ..., Ludwig Schmidt
NeurIPS 2025
GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents
Manish Shetty, Naman Jain, Jinjian Liu, Vijay Kethanaboyina, Koushik Sen, Ion Stoica
COLM 2025
R2E-Gym: Procedural Environments and Hybrid Verifiers for Scaling Open-Weights SWE Agents
Naman Jain*, Jaskirat Singh*, Manish Shetty, Liang Zheng, Koushik Sen, Ion Stoica
ICML 2025
Challenges and Paths Towards AI for Software Engineering
Alex Gu, Naman Jain*, Wen-Ding Li*, Manish Shetty*, Yijia Shao, Ziyang Li, Diyi Yang, Kevin Ellis, Koushik Sen, Armando Solar-Lezama
LLM4Code 2025
Syzygy: Dual Code-Test C to Rust Translation using LLMs and Dynamic Analysis
Manish Shetty*, Naman Jain*, Adwait Godbole*, Sanjit Seshia, Koushik Sen
ICML 2024
R2E: Turning any GitHub Repository into a Programming Agent Environment
Manish Shetty*, Naman Jain*, Tianjun Zhang, King Han, Koushik Sen, Ion Stoica


Awards