DiffusionLLM - Inception Mercury 2 - 11,000 tokens per second on NVIDIA H100 GPUs.
r/LocalLLaMA
AI Hardware
Submitted by /u/Revolutionary_Ask154