A few days ago I switched to Linux to try vLLM out of curiosity. Ended up creating a %100 local, parallel, multi-agent setup with Claude Code and gpt-oss-120b for concurrent vibecoding and orchestration with CC's agent Teams entirely offline. This video shows 4 agents collaborating.

r/LocalLLaMA
Generative AI AI Hardware AI Tools

This isn't a repo, its just how my Linux workstation is built. My setup was the following: vLLM Docker container - for easy deployment and parallel inference. Claude Code - vibecoding and Agent Teams orchestration. Points at vLLM localhost endpoint instead of a cloud provider. gpt-oss:120b - Coding agent. RTX Pro 6000 Blackwell MaxQ - GPU workhorse Dual-boot Ubuntu I never realized how much Windows was holding back my PC and agents after I switched to Linux. It was so empowering when I made the switch to a dual-boot Ubuntu and hopped on to vLLM.