Grok 4.3 underperforms Grok 4.20 0309 on the Extended NYT Connections Benchmark: the score drops from 93.4 to 67.5, although Grok 4.3 achieves its result at a lower cost than the earlier Grok 4.20 run.
r/singularity
AI Research
Submitted by /u/zero0_one1