6-GPU multiplexer from K80s: hot-swap between models in 0.3 ms

r/LocalLLaMA
AI Hardware

After working on boot AI, I purchased some old Bitcoin mining hardware to see if I could run old NVIDIA cards on it. I ended up building a system that multiplexes 6 GPU dies through a single PCIe slot using a custom Linux kernel module. It switches between loaded models in under a millisecond.
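
A rough way to picture the hot-swap, assuming each die keeps its model resident in its own VRAM and a switch only changes which die receives requests. That assumption, the `DieMux` class, and the model names are all hypothetical and only for illustration; this is a toy user-space model, not the kernel module itself.

```python
class DieMux:
    """Toy model of a multiplexer: one model stays loaded per GPU die,
    and switching only changes which die is routed to (hypothetical design)."""

    def __init__(self, num_dies=6):
        self.slots = [None] * num_dies  # model name resident on each die
        self.active = None              # index of the die currently selected

    def _check(self, die):
        if not 0 <= die < len(self.slots):
            raise IndexError(f"die {die} out of range")

    def load(self, die, model_name):
        # In a real system this is the slow step (weights copied into VRAM).
        # Here it only records which model lives on which die.
        self._check(die)
        self.slots[die] = model_name

    def switch(self, die):
        # The fast step: nothing is copied, only the routing target changes.
        self._check(die)
        if self.slots[die] is None:
            raise ValueError(f"no model loaded on die {die}")
        self.active = die
        return self.slots[die]


mux = DieMux()
for die, name in enumerate(["model-a", "model-b", "model-c",
                            "model-d", "model-e", "model-f"]):
    mux.load(die, name)

print(mux.switch(3))  # selects the model already resident on die 3
```

The point of the sketch is only that the expensive part (loading) happens once per die, so a switch between already-loaded models does not have to move any weights.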