Wrote up why vector RAG keeps failing on complex documents and found a project doing retrieval without embeddings at all
r/LocalLLaMA
I've been building RAG pipelines for a while and kept hitting the same wall: retrieval works fine on simple documents but falls apart the moment you throw a dense financial report, legal filing, or technical manual at it. I spent some time digging into why, and it basically comes down to one thing: similarity is not the same as relevance. The chunk with the highest cosine similarity to your query is often not the chunk that actually answers it.
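To make the similarity-vs-relevance gap concrete, here's a toy sketch. It uses a plain bag-of-words vector as a stand-in for an embedding model, and both the query and the two chunks are made up for illustration (the figure in the "answer" chunk is hypothetical). Real dense embeddings handle paraphrase far better than this, so treat it as an exaggerated picture of the failure mode, not a benchmark.

```python
import math
from collections import Counter


def bow_vector(text):
    """Toy 'embedding': lowercase word counts. A stand-in for a real model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical query and chunks, invented for this example.
query = "what was total revenue in 2023"
chunks = {
    # Repeats the query's words but contains no actual figure.
    "toc": "total revenue in 2023 is covered in the revenue section of this report",
    # Contains the answer, phrased with none of the query's words.
    "answer": "net sales for the fiscal year came to 4.2 billion dollars",
}

query_vec = bow_vector(query)
scores = {name: cosine(query_vec, bow_vector(text)) for name, text in chunks.items()}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print("top-ranked chunk:", best)
```

The table-of-contents style chunk wins because it echoes the query's wording, while the chunk that actually holds the number shares no terms with the query and scores zero. That's the "similar but not relevant" problem in miniature.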