Qwen 3.6 27B Q8 on four Nvidia RTX A4000 (16GB each) with Llama.cpp and MTP enabled

My setup is heterogeneous. I originally acquired my server (a Lenovo ThinkStation P3 Tower Gen 2) to run OpenShift/K8s clusters, because that is what I work on. Later on I started purchasing Nvidia RTX A4000 cards one by one, each with 16GB of VRAM. Yes, it's old technology, but hear me out: each card draws 140W and takes up one PCIe slot, so I can accommodate four cards in my server.
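
For anyone who wants the numbers spelled out, here is a rough back-of-the-envelope sketch of what four of these cards add up to. The card count, VRAM, and wattage come from the post. Everything else is an assumption on my part: I read "27B" as roughly 27 billion weights, I assume "Q8" means llama.cpp's Q8_0 format (about 8.5 bits per weight, since each block of 32 int8 weights carries one fp16 scale), and I ignore the GB vs GiB difference. The function name `rig_budget` is just made up for this example. The estimate covers weights only, not the KV cache, MTP overhead, or compute buffers, so treat the headroom as an upper bound.

```python
# Back-of-the-envelope budget for the four-card rig described above.
# Values taken from the post:
NUM_CARDS = 4
VRAM_PER_CARD_GB = 16
POWER_PER_CARD_W = 140

# Assumptions (NOT from the post):
PARAMS_BILLIONS = 27          # "27B" read as ~27e9 weights
Q8_0_BITS_PER_WEIGHT = 8.5    # Q8_0: 32 int8 weights + one fp16 scale per block


def rig_budget():
    """Return total VRAM, total card power, and a weights-only size estimate."""
    total_vram_gb = NUM_CARDS * VRAM_PER_CARD_GB
    total_power_w = NUM_CARDS * POWER_PER_CARD_W
    # billions of weights * bits per weight / 8 bits per byte = gigabytes
    weights_gb = PARAMS_BILLIONS * Q8_0_BITS_PER_WEIGHT / 8
    return {
        "total_vram_gb": total_vram_gb,
        "total_power_w": total_power_w,
        "weights_gb": weights_gb,
        # What is left for KV cache and buffers, before any other overhead.
        "headroom_gb": total_vram_gb - weights_gb,
    }


if __name__ == "__main__":
    for name, value in rig_budget().items():
        print(f"{name}: {value}")
```

Under those assumptions the four cards give 64GB of VRAM at 560W combined, with the Q8 weights taking a bit under half of that, which is why a single-slot 16GB card is still interesting here.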