Introducing AutoJudge: Streamlined inference acceleration via automated dataset curation
Together AI Blog
Machine Learning
Generative AI
AutoJudge accelerates LLM inference by identifying which token mismatches actually matter. Using self-supervised learning to train a lightweight classifier, it accepts up to 40 draft tokens per cycle, delivering 1.5-2× speedups over standard speculative decoding with minimal accur