Budget-Aware Anytime Reasoning with LLM-Synthesized Preference Data

arXiv:2601.11038v2 Announce Type: replace We study the reasoning behavior of large language models (LLMs) under limited computation budgets. In such settings, producing useful partial solutions quickly is often more practical than exhaustive reasoning, which incurs high inference costs. Many real-world tasks, such as trip planning, require models to deliver the best possible output within a fixed reasoning budget. We