Explainable Disentangled Representation Learning for Generalizable Authorship Attribution in the Era of Generative AI

arXiv:2604.21300v1 Announce Type: cross Learning robust representations of authorial style is crucial for authorship attribution and AI-generated text detection. However, existing methods often struggle with content-style entanglement, in which models learn spurious correlations between authors' writing styles and their topics, leading to poor generalization across domains. To address this challenge, we propose the Explainable Authorship Variational Autoencoder (EAVAE), a novel framework that explicitly disentangles style from content through architectural separation-by-design.