RECIPE: Procedural Planning via Grounding in Instructional Video
arXiv cs.CV
arXiv:2605.19976v1 Announce Type: new
Visual planning asks a model to generate, in natural language, the remaining steps of a procedure given a partial video context and a goal. Progress on this task is bottlenecked by annotation: clean labeled datasets are small and domain-narrow, and each example encodes a single execution trajectory even though many valid orderings exist.