AI RESEARCH
Production AI is very different from the demos [D]
r/MachineLearning
Moved an AI feature into production a few months ago and the cost profile has been a constant surprise since. The early prototypes ran cheap because the volume was tiny and the prompts were short, but once it hit real traffic the token usage scaled a lot. I think it was partly because customers ask longer and less clear questions than our test set, and partly because we ended up adding context retrieval that doubled the input length on every call.
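For anyone who wants to sanity-check this before launch, here is a rough back-of-the-envelope sketch of how the cost moves with call volume and input length. Every number in it is made up for illustration: the prices, the call volumes, and the token counts are assumptions, not our real figures or any provider's actual pricing, and `Workload` and `daily_cost` are just hypothetical helper names.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    calls_per_day: int
    input_tokens: int   # average input tokens per call (prompt + any retrieved context)
    output_tokens: int  # average output tokens per call


def daily_cost(w: Workload, usd_per_1k_input: float, usd_per_1k_output: float) -> float:
    """Estimated daily spend in USD for a workload at the given per-1k-token prices."""
    per_call = (w.input_tokens / 1000) * usd_per_1k_input \
             + (w.output_tokens / 1000) * usd_per_1k_output
    return w.calls_per_day * per_call


# Hypothetical prices, for illustration only.
PRICE_IN = 0.001   # USD per 1k input tokens
PRICE_OUT = 0.002  # USD per 1k output tokens

# Hypothetical workloads: a tiny prototype, production traffic with longer
# questions, and the same traffic after retrieval doubles the input length.
prototype = Workload(calls_per_day=100, input_tokens=500, output_tokens=200)
production = Workload(calls_per_day=50_000, input_tokens=1_500, output_tokens=200)
production_rag = Workload(calls_per_day=50_000, input_tokens=3_000, output_tokens=200)

for name, w in [("prototype", prototype),
                ("production", production),
                ("production + retrieval", production_rag)]:
    print(f"{name}: ${daily_cost(w, PRICE_IN, PRICE_OUT):.2f}/day")
```

The thing this makes obvious is that volume and input length multiply each other, so doubling the input on every call hurts far more at production traffic than it ever would in a prototype.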