2x RTX Pro 6000 vs 2x A100 80GB dense model inference

Has anyone compared inference performance on these two setups, using the largest dense model (not sparse or MoE) that fits on both?

* 2x RTX Pro 6000 Blackwell 96GB (workstation, not Max-Q) on a PCIe Gen5 x16 bus: NVFP4 quantized
* 2x A100 80GB Ampere, triple NVLink'd: W4A16 quantized

submitted by /u/RealTime3392