Cerebras is running a trillion-parameter model (Kimi K2.6) at 1,000 tokens/s

r/singularity
AI Hardware

Link to tweet: Link to blog: submitted by /u/socoolandawesome