Using all 31 free NVIDIA NIM models at once with automatic routing and failover

r/LocalLLaMA
AI Hardware

Been using the NVIDIA NIM free tier for a while, and the main annoyance is picking which model to hit and dealing with rate limits (~40 RPM per model