AI RESEARCH

From Documents to Segments: A Contextual Reformulation for Topic Assignment

arXiv cs.CL

arXiv:2605.17714v1 Announce Type: new

Traditional topic modeling assigns a single topic to each document. In practice, however, many real-world documents, such as product reviews or open-ended survey responses, contain multiple distinct topics. This mismatch often leads to topic contamination, where unrelated themes are merged into a single topic, making it difficult to identify documents that truly focus on a specific subject. We address this issue by