DSGym Offers a Reusable Container Based Substrate for Building and Benchmarking Data Science Agents

Data science agents should inspect datasets, design workflows, run code, and return verifiable answers, not just autocomplete Pandas code. DSGym, introduced by researchers from Stanford University, Together AI, Duke University, and Harvard University, is a framework that evaluates and trains…



