Unsupervised Corpus Poisoning Attacks in Continuous Space for Dense Retrieval

arXiv:2504.17884v2 Announce Type: replace-cross This paper concerns corpus poisoning attacks in dense information retrieval, where an adversary attempts to compromise the ranking performance of a search algorithm by injecting a small number of maliciously generated documents into the corpus. Our work addresses two limitations in the current literature. First, attacks that perform adversarial gradient-based word substitution search do so in the discrete lexical space, while retrieval itself happens in the continuous embedding space.
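
To illustrate why the embedding space matters, the following minimal sketch ranks a toy corpus by dot-product similarity and then adds one injected vector. It is not the paper's attack: the two-dimensional embeddings, the `rank_documents` helper, and the choice of a scaled copy of the query as the injected vector are all hypothetical, and a real attack would still need to produce a document whose encoding lands near such a vector.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_documents(query_vec, doc_vecs):
    """Return document indices sorted by descending dot-product score."""
    scores = [dot(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: -scores[i])


# Hypothetical toy embeddings: three clean documents in a 2-D space.
corpus = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]
query = [0.6, 0.8]

clean_ranking = rank_documents(query, corpus)

# Hypothetical injected vector, chosen directly in the continuous space
# (here simply a scaled copy of the query, for illustration only).
poison = [1.5 * x for x in query]
poisoned_ranking = rank_documents(query, corpus + [poison])

print("clean ranking:   ", clean_ranking)
print("poisoned ranking:", poisoned_ranking)
```

In this toy setting the injected vector (index 3) moves to the top of the ranking, because the retriever only compares vectors and never inspects the words of a document. An attack that searches over word substitutions can influence this comparison only indirectly, through the encoder.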