Zero-Shot Faithful Textual Explanations via Directional-Derivative Influence on Predictions

arXiv:2605.16877v1

Zero-shot textual explanations aim to make image classifiers transparent by probing their internal representations, without relying on task-specific supervision or large vision-language models (LVLMs). However, existing methods often miss the features that truly drive the prediction, which limits their \textit{faithfulness} to the evidence underlying the model's decision. To address this, we propose FaithTrace.