Improved Batch Inference API: Enhanced UI, Expanded Model Support, and 3000× Rate Limit Increase

Together AI Blog
Generative AI

Our new Batch Inference API makes large-scale AI workloads simpler, faster, and cheaper. With a streamlined UI, universal model support, and 3000× higher rate limits (now up to 30B tokens), you can process massive datasets at half the cost of real-time APIs.