AI RESEARCH
A framework for analyzing concept representations in neural models
arXiv cs.LG
arXiv:2605.01381v1 Announce Type: cross Understanding how neural models represent human-interpretable concepts is challenging. Prior work has explored linear concept subspaces from diverse perspectives, such as probing and concept erasure. We