How do you tell users your AI agent is down?

r/artificial

Serious question. If you're running an agent in production (customer bot, coding assistant, data pipeline), what happens when it breaks at 3 AM? Traditional status pages track HTTP endpoints. They don't understand model providers, agent latency, reasoning loops, or context limits. "Partial outage" tells your users nothing when the real problem is GPT-5.4 timing out or your RAG pipeline choking. I'm currently exploring letting an agent manage its own status page. I haven't seen another status page do this, and I'm hooked. I use it to monitor the agent.
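
To make "agent-aware status" a bit more concrete, here's a rough Python sketch of how an agent could grade its own components instead of reporting one generic "partial outage". Everything in it is an assumption for illustration: the `AgentSignals` fields, the component names, and the thresholds are made up, and it doesn't call any real status page API.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    OPERATIONAL = "operational"
    DEGRADED = "degraded"
    DOWN = "down"


# Ranking used to pick the worst component as the headline status.
_SEVERITY = {Status.OPERATIONAL: 0, Status.DEGRADED: 1, Status.DOWN: 2}


@dataclass
class AgentSignals:
    """Hypothetical measurements the agent collects about itself per window."""
    provider_timeout_rate: float  # share of model calls that timed out (0..1)
    p95_latency_s: float          # 95th percentile end-to-end response time
    max_loop_steps: int           # longest reasoning loop seen in the window
    context_utilization: float    # peak share of the context window used (0..1)
    retrieval_error_rate: float   # share of RAG lookups that failed (0..1)


def grade(value: float, degraded_at: float, down_at: float) -> Status:
    """Map one measurement to a status using two thresholds."""
    if value >= down_at:
        return Status.DOWN
    if value >= degraded_at:
        return Status.DEGRADED
    return Status.OPERATIONAL


def build_status(s: AgentSignals) -> dict[str, Status]:
    """Grade each agent-specific component. Thresholds are placeholders."""
    return {
        "model provider": grade(s.provider_timeout_rate, 0.05, 0.50),
        "response latency": grade(s.p95_latency_s, 10.0, 60.0),
        "reasoning loops": grade(s.max_loop_steps, 25, 100),
        "context window": grade(s.context_utilization, 0.80, 0.98),
        "retrieval (RAG)": grade(s.retrieval_error_rate, 0.05, 0.50),
    }


def overall(components: dict[str, Status]) -> Status:
    """The headline status is the worst status of any component."""
    return max(components.values(), key=_SEVERITY.__getitem__)


def render(components: dict[str, Status]) -> str:
    """Plain-text status page body: headline first, then one line per component."""
    lines = [f"overall: {overall(components).value}"]
    lines += [f"{name}: {status.value}" for name, status in components.items()]
    return "\n".join(lines)


if __name__ == "__main__":
    signals = AgentSignals(
        provider_timeout_rate=0.6,
        p95_latency_s=4.0,
        max_loop_steps=12,
        context_utilization=0.5,
        retrieval_error_rate=0.0,
    )
    print(render(build_status(signals)))
```

The point of the per-component breakdown is that users see "model provider: down" rather than a vague banner. In a real setup the agent would push this to whatever status page you use, and you'd want an external check as well, since an agent that is fully down can't report on itself.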