SWE-Bench Performance Reaches 50.8% Without Tool Use: A Case for Monolithic State-in-Context Agents

Recent advancements in LM agents have shown promising potential for automating intricate real-world tasks. These agents typically operate by proposing and executing actions through APIs, supporting applications such as software engineering, robotics, and scientific experimentation. As these tasks become more…





