Tree SAE: Learning Hierarchical Feature Structures in Sparse Autoencoders

arXiv:2605.07922v1

Learning hierarchical features in Sparse Autoencoders (SAEs) is essential for capturing the structured nature of real-world data and for mitigating issues such as feature absorption and feature splitting. Existing works attempt to identify hierarchical relationships within independent feature sets by relying on activation coverage: the assumption that a child feature should activate only when its parent feature activates.
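
To make the activation coverage assumption concrete, here is a minimal sketch of how one might measure it for a candidate parent/child pair. It is an illustration only, not the paper's method: the function name `activation_coverage`, the `threshold` parameter, and the toy activation values are all hypothetical.

```python
import math

def activation_coverage(parent_acts, child_acts, threshold=0.0):
    """Fraction of samples with the child active that also have the parent active.

    Hypothetical helper: a value of 1.0 means the child never fires without
    its parent, which is what the coverage assumption expects.
    """
    if len(parent_acts) != len(child_acts):
        raise ValueError("activation sequences must have the same length")
    # A feature counts as active when its activation exceeds the threshold.
    child_on = [c > threshold for c in child_acts]
    n_child = sum(child_on)
    if n_child == 0:
        return math.nan  # coverage is undefined if the child never activates
    both = sum(1 for p, c in zip(parent_acts, child_on) if c and p > threshold)
    return both / n_child

# Toy activations over five samples (made-up values).
parent = [0.0, 1.2, 0.8, 0.0, 2.1]
child_ok = [0.0, 0.5, 0.0, 0.0, 0.9]   # fires only where the parent fires
child_bad = [0.3, 0.5, 0.0, 0.0, 0.0]  # fires once without the parent

coverage_ok = activation_coverage(parent, child_ok)
coverage_bad = activation_coverage(parent, child_bad)
```

Under this sketch, a pair satisfying the assumption scores 1.0, and any sample where the child fires without its parent lowers the score.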