FinRule-Bench: A Benchmark for Joint Reasoning over Financial Tables and Principles

arXiv:2603.11339v1. Large language models (LLMs) are increasingly applied to financial analysis, yet their ability to audit structured financial statements under explicit accounting principles remains underexplored. Existing benchmarks primarily evaluate question answering, numerical reasoning, or anomaly detection on synthetically corrupted data, leaving it unclear whether models can reliably verify or localize rule compliance on correct financial statements. We