Beyond Statistical Co-occurrence: Unlocking Intrinsic Semantics for Tabular Data Clustering

arXiv cs.AI

arXiv:2604.10865v1. Deep Clustering (DC) has emerged as a powerful tool for tabular data analysis in real-world domains such as finance and healthcare. However, most existing methods infer the latent metric space from data-level statistical co-occurrence alone, and often overlook the intrinsic semantic knowledge carried by feature names and values. As a result, semantically related concepts such as 'Flu' and 'Cold' are often treated as unrelated symbolic tokens, which leaves conceptually related samples isolated from one another.
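
To make the "symbolic token" problem concrete, here is a minimal sketch in Python. It is not the paper's method. The category list and the helper names `one_hot` and `squared_distance` are assumptions made up for illustration. It only shows that a purely symbolic (one-hot) encoding places 'Flu' exactly as far from 'Cold' as from an unrelated value.

```python
# Hypothetical categorical values for a "diagnosis" feature (illustrative only,
# not taken from the paper).
CATEGORIES = ["Flu", "Cold", "Fracture"]


def one_hot(value, categories=CATEGORIES):
    """Encode a value as a one-hot vector, i.e. as a purely symbolic token."""
    return [1.0 if value == c else 0.0 for c in categories]


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


# Under one-hot encoding every pair of distinct values is equally far apart,
# so the encoding carries no notion that 'Flu' is closer to 'Cold'.
d_flu_cold = squared_distance(one_hot("Flu"), one_hot("Cold"))
d_flu_fracture = squared_distance(one_hot("Flu"), one_hot("Fracture"))

print(d_flu_cold, d_flu_fracture)
```

Both distances come out identical, so any similarity between 'Flu' and 'Cold' would have to be recovered from co-occurrence statistics in the data, or from semantic knowledge about the values themselves, which is the gap the abstract points to.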