Tree-of-Evidence: Efficient "System 2" Search for Faithful Multimodal Grounding

arXiv:2604.07692v2

Large Multimodal Models (LMMs) achieve state-of-the-art performance in high-stakes domains like healthcare, yet their reasoning remains opaque. Current interpretability methods, such as attention mechanisms or post-hoc saliency, often fail to faithfully represent the model's decision-making process, particularly when integrating heterogeneous modalities like time-series and text. We