You can do CUDA inference on an Apple Silicon Mac with PCI Passthrough

r/LocalLLaMA
AI Hardware

I have been working on a project to adapt QEMU, running on macOS, to pass a GPU through to a Linux VM. I wrote this post walking through some of the interesting challenges involved, along with benchmarks. The post focuses heavily on gaming, but it includes AI benchmarks as well. Submitted by /u/scottjgo.