From AWS to Together Dedicated Endpoints: Arcee AI's journey to greater inference flexibility

Together AI Blog
AI Hardware

Arcee AI moved its infrastructure from AWS to Together Dedicated Endpoints, cutting time to first token (TTFT) by 95%, reaching throughput of 41+ queries per second (QPS), and removing GPU overhead.