Beyond Retrieval: A Multitask Benchmark and Model for Code Search

arXiv:2605.04615v1

Code search has usually been evaluated as first-stage retrieval, even though production systems rely on broader pipelines with reranking and developer-style queries. Existing benchmarks also suffer from data contamination, label noise, and degenerate binary relevance. In this paper, we