Improved Batch Inference API: Enhanced UI, Expanded Model Support, and 3000× Rate Limit Increase
Together AI Blog • Generative AI
Our new Batch Inference API makes large-scale AI workloads simpler, faster, and cheaper. With a streamlined UI, universal model support, and 3000× higher rate limits (now up to 30B tokens), you can process massive datasets at half the cost of real-time APIs.
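To put those headline numbers in context, here is a rough back-of-the-envelope sketch in Python. It reads the 3000× figure as a straight multiplier on the previous limit and treats "half the cost" as a flat 50% discount on a real-time per-token price. The function names and the price of $1.00 per million tokens are hypothetical placeholders for illustration, not Together AI's actual API or pricing.

```python
# Figures quoted in the announcement.
NEW_LIMIT_TOKENS = 30_000_000_000  # "up to 30B tokens"
INCREASE_FACTOR = 3000             # "3000x higher rate limits"
BATCH_DISCOUNT = 0.5               # "half the cost of real-time APIs"


def implied_previous_limit(new_limit: int, factor: int) -> int:
    """Previous token limit, assuming the increase is a plain multiplier."""
    return new_limit // factor


def batch_cost(tokens: int, realtime_price_per_million: float) -> float:
    """Estimated batch cost, assuming a flat 50% discount on the real-time price."""
    return tokens / 1_000_000 * realtime_price_per_million * BATCH_DISCOUNT


print(implied_previous_limit(NEW_LIMIT_TOKENS, INCREASE_FACTOR))  # 10000000

# Hypothetical real-time price (USD per million tokens), for illustration only.
HYPOTHETICAL_PRICE = 1.00
print(batch_cost(2_000_000_000, HYPOTHETICAL_PRICE))  # 1000.0
```

Under these assumptions, the earlier limit works out to 10M tokens, and a 2B-token job that would cost $2,000 at the placeholder real-time price would come to about $1,000 as a batch.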