AI RESEARCH
SimulCost: A Cost-Aware Benchmark and Toolkit for Automating Physics Simulations with LLMs
arXiv CS.AI
arXiv:2603.20253v1 Announce Type: cross Evaluations of LLM agents on scientific tasks have focused on token costs while ignoring tool-use costs such as simulation time and experimental resources. As a result, metrics like pass@k become impractical under realistic budget constraints. To address this gap, we