AI RESEARCH
Leakage and Interpretability in Concept-Based Models
arXiv CS.AI
arXiv:2504.14094v3 Announce Type: replace-cross Concept-based models aim to improve interpretability by predicting high-level intermediate concepts, representing a promising approach for deployment in high-risk scenarios. However, they are known to suffer from information leakage, whereby models exploit unintended information encoded within the learned concepts. We