Reasoning as Gradient: Scaling MLE Agents Beyond Tree Search

arXiv:2603.01692v2 Announce Type: replace-cross LLM-based agents for machine learning engineering (MLE) predominantly rely on tree search, a form of gradient-free optimization that uses scalar validation scores to rank candidates. As LLM reasoning capabilities improve, exhaustive enumeration becomes increasingly inefficient compared to directed updates, much as accurate gradients make descent more efficient than random search. We