AI RESEARCH
Topic Is Not Agenda: A Citation-Community Audit of Text Embeddings
arXiv cs.LG
arXiv:2605.07158v1 Announce Type: cross Vector search and retrieval-augmented generation (RAG) rest on the assumption that cosine similarity between text embeddings reflects conceptual relatedness. We measure where this assumption breaks. We build an augmented citation graph over 3.58M scientific papers and partition it via Leiden CPM at two granularities: sub-field (L1) and research-agenda (L2, hierarchical inside each L1