AI RESEARCH

FinTagging: Benchmarking LLMs for Extracting and Structuring Financial Information

arXiv CS.CL

arXiv:2505.20650v5

Accurate interpretation of numerical data in financial reports is critical for markets and regulators. Although XBRL (eXtensible Business Reporting Language) provides a standard for tagging financial figures, mapping thousands of facts to more than 10,000 US GAAP concepts remains costly and error-prone. Existing benchmarks oversimplify this task as flat, single-step classification over small subsets of concepts, ignoring the hierarchical semantics of the taxonomy and the structured nature of financial documents.