Building a GraphRAG vs Traditional RAG Benchmarking System on Indian Public Health Literature

I'm building a benchmarking platform to rigorously compare three AI retrieval pipelines on a large corpus of Indian public health research papers from PubMed Central. Here's the architecture, the engineering decisions, and why I think graph-based retrieval is the right approach for this problem - before the benchmark numbers are in.

The Problem in One Sentence

Ask a RAG system: "How does diabetes affect TB treatment outcomes, and what role does HbA1c play?" Vector search returns chunks about diabetes. Chunks about TB. Maybe a chunk about HbA1c.