Arc Sentry outperformed LLM Guard in a head-to-head benchmark: 92% vs. 70% detection. Here is how it works.
r/artificial
I built Arc Sentry, a pre-generation prompt injection detector for open-weight LLMs. Instead of scanning text for patterns after the fact, it reads the model’s internal residual stream before generate is called and blocks requests that destabilize the model’s information geometry.

Head-to-head benchmark on a 130-prompt SaaS deployment dataset:

- Arc Sentry: 92% detection, 0% false positives
- LLM Guard: 70% detection, 3.3% false positives

The difference is architectural. LLM Guard classifies input text. Arc Sentry measures whether the model itself is being pushed into an unstable regime.
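To make the pre-generation idea concrete, here is a minimal sketch of a gate that sits in front of generate. It is not Arc Sentry's actual scoring rule, which the post does not spell out. The class `ResidualGate`, the helper `guarded_generate`, the `margin` parameter, and the cosine-distance-to-centroid score are all hypothetical stand-ins for whatever "information geometry" measure the real tool uses. The sketch also assumes you have already pulled one residual-stream vector per prompt out of the model (for example, a last-token hidden state from a forward pass) and works on plain NumPy arrays.

```python
import numpy as np


def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Return 1 - cosine similarity between two 1-D vectors."""
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


class ResidualGate:
    """Hypothetical pre-generation gate (illustration only, not Arc Sentry's method)."""

    def __init__(self, margin: float = 1.5):
        self.margin = margin      # assumed safety factor on the calibration spread
        self.centroid = None
        self.threshold = None

    def calibrate(self, baseline: np.ndarray) -> None:
        """Fit on residual vectors of shape (n_prompts, hidden_dim) from known-good traffic."""
        self.centroid = baseline.mean(axis=0)
        dists = [cosine_distance(v, self.centroid) for v in baseline]
        # Assumption: threshold = margin x the largest distance seen during calibration.
        self.threshold = self.margin * max(dists)

    def allow(self, residual: np.ndarray) -> bool:
        """Return True if the request may proceed to generation."""
        if self.centroid is None:
            raise RuntimeError("call calibrate() first")
        return cosine_distance(residual, self.centroid) <= self.threshold


def guarded_generate(gate: ResidualGate, residual: np.ndarray, generate_fn):
    """Call generate_fn only when the gate accepts the residual vector."""
    if not gate.allow(residual):
        return None  # blocked before any tokens are produced
    return generate_fn()


if __name__ == "__main__":
    gate = ResidualGate(margin=1.5)
    # Synthetic stand-in for residual vectors collected from normal traffic.
    gate.calibrate(np.ones((4, 8)) + 0.1 * np.eye(4, 8))

    in_distribution = np.ones(8)
    far_from_baseline = np.array([1.0, -1.0] * 4)
    print(guarded_generate(gate, in_distribution, lambda: "model output"))
    print(guarded_generate(gate, far_from_baseline, lambda: "model output"))
```

The point of the design is the ordering: the check runs on the model's internal state for the prompt, and generation is only invoked if the check passes. A text classifier, by contrast, never looks at what the prompt does inside the model.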