Giving AI Agents Eyes (Part 2): From Page Snapshots to Interaction Traces

In Part 1, we solved the representation problem: how to give an LLM a compact, semantic view of a web page using accessibility trees. That gave our AI agent the ability to answer "what is on this page?" But users don't ask that.