CCTU: A Benchmark for Tool Use under Complex Constraints

arXiv:2603.15309v1 Announce Type: cross Solving problems through tool use under explicit constraints constitutes a highly challenging yet unavoidable scenario for large language models (LLMs), requiring capabilities such as function calling, instruction following, and self-refinement. However, progress has been hindered by the absence of dedicated evaluations. To address this, we