Benchmarked Qwen3.5 (35B MoE, 27B Dense, 122B MoE) across Apple Silicon and AMD GPUs — ROCm vs Vulkan results were surprising, and context size matters

r/LocalLLaMA
AI Hardware

I wanted to compare inference performance across my machines to decide whether keeping a new MacBook Pro was worth it alongside my GPU server. When I went looking for practical comparisons - real models, real workloads, Apple Silicon vs AMD GPUs, ROCm vs Vulkan - I couldn't find much beyond synthetic benchmarks or single-machine reviews. So I ran my own tests.