Show HN: LLM agents that write Python to analyze execution traces at scale
Hacker News Show AI
We combined Stanford's ACE (agents learning from execution feedback) with the Reflective Language Model pattern. Instead of reading traces in a single pass, an LLM writes and runs Python in a sandbox to explore them programmatically, finding cross-trace patterns that single-pass analysis misses. The framework achieved a 2x consistency improvement on τ2-bench.
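To make "cross-trace patterns" concrete, here is a minimal sketch of the kind of script an LLM might write and run in the sandbox. It is an illustration, not the framework's code: the trace format (`id`, `steps`, `tool`, `error`), the sample data, and the helper `cross_trace_failures` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical trace format: one dict per trace, with an id and a list of steps.
# The real framework's trace schema is not described in the post.
TRACES = [
    {"id": "t1", "steps": [{"tool": "search", "error": None},
                           {"tool": "book_flight", "error": "missing_payment"}]},
    {"id": "t2", "steps": [{"tool": "book_flight", "error": "missing_payment"}]},
    {"id": "t3", "steps": [{"tool": "search", "error": "timeout"}]},
]


def cross_trace_failures(traces, min_traces=2):
    """Return (tool, error) pairs that occur in at least `min_traces` distinct traces."""
    seen_in = defaultdict(set)  # (tool, error) -> ids of traces where it appeared
    for trace in traces:
        for step in trace["steps"]:
            if step["error"] is not None:
                seen_in[(step["tool"], step["error"])].add(trace["id"])
    # Keep only patterns that recur across traces; one-off failures are dropped.
    return {
        pattern: sorted(ids)
        for pattern, ids in seen_in.items()
        if len(ids) >= min_traces
    }


if __name__ == "__main__":
    print(cross_trace_failures(TRACES))
    # {('book_flight', 'missing_payment'): ['t1', 't2']}
```

A failure that shows up once per trace looks like noise when each trace is read on its own; aggregating over all traces is what surfaces it as a recurring pattern.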