Best RTX Pro 6000 vLLM settings?

r/LocalLLaMA

Just got myself (for my company) an RTX Pro 6000 Blackwell Workstation card. Managed to get really good TPS on qwen3 27b FP8. I'm using it for many agents that each specialize in one specific task at a time. Trying to get the best possible speed + concurrency running on vLLM 0.20.1 nightly with CUDA 13.1.
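
To make the question concrete, here is a minimal sketch of the knobs that usually trade off speed against concurrency in vLLM. Everything in it is an assumption for illustration: the values are untuned placeholders, the model string is a stand-in rather than a real checkpoint ID, and `build_vllm_serve_cmd` is a made-up helper name. The flags themselves are standard `vllm serve` options, but check them against the build you are running.

```python
import shlex

# Placeholder only: substitute the checkpoint actually being served.
MODEL = "<qwen3-27b-fp8-checkpoint>"


def build_vllm_serve_cmd(
    model,
    max_num_seqs=64,                # upper bound on sequences batched together
    max_num_batched_tokens=8192,    # token budget per scheduler step
    gpu_memory_utilization=0.90,    # fraction of VRAM vLLM may claim
    max_model_len=32768,            # context length cap; lower leaves more KV cache
):
    """Return an argv list for `vllm serve` with the main throughput knobs.

    All default values here are illustrative guesses, not measured optima.
    """
    return [
        "vllm", "serve", model,
        "--max-num-seqs", str(max_num_seqs),
        "--max-num-batched-tokens", str(max_num_batched_tokens),
        "--gpu-memory-utilization", str(gpu_memory_utilization),
        "--max-model-len", str(max_model_len),
        "--enable-prefix-caching",  # reuse KV cache for shared prompt prefixes
    ]


if __name__ == "__main__":
    # Print a shell-quoted command line that can be copied into a terminal.
    print(shlex.join(build_vllm_serve_cmd(MODEL)))
```

With many short, single-task agents that share a system prompt, prefix caching and a higher `--max-num-seqs` are the settings most likely to matter, while a smaller `--max-model-len` frees VRAM for more concurrent KV cache. Which combination is actually best on this card is exactly what I'm asking about.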