AI RESEARCH

Product Evals in Three Simple Steps

Eugene Yan Blog

Label some data, align LLM-evaluators, and run the eval harness with each change.
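
To make the "align LLM-evaluators" step concrete, here is a minimal sketch of one way to compare an LLM-evaluator's verdicts against human labels on the same samples. It is an illustration under stated assumptions, not the post's own implementation: the `Alignment` and `measure_alignment` names, the boolean label convention (`True` means the output was marked as failing), and the choice of precision, recall, and Cohen's kappa as the metrics are all assumed here.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    """Agreement between human labels and an LLM-evaluator (hypothetical type)."""
    precision: float
    recall: float
    kappa: float


def measure_alignment(human: list[bool], judge: list[bool]) -> Alignment:
    """Compare evaluator verdicts to human labels on the same samples.

    Assumed convention: True means the output was marked as failing.
    """
    if not human or len(human) != len(judge):
        raise ValueError("label lists must be non-empty and the same length")

    n = len(human)
    tp = sum(h and j for h, j in zip(human, judge))
    fp = sum((not h) and j for h, j in zip(human, judge))
    fn = sum(h and (not j) for h, j in zip(human, judge))
    tn = n - tp - fp - fn

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_observed = (tp + tn) / n
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (p_observed - p_expected) / (1 - p_expected) if p_expected < 1 else 1.0

    return Alignment(precision=precision, recall=recall, kappa=kappa)


if __name__ == "__main__":
    # Toy labels, invented purely for illustration.
    human_labels = [True, True, False, False, True, False]
    judge_labels = [True, False, False, False, True, True]
    print(measure_alignment(human_labels, judge_labels))
```

Re-running a check like this on a held-out labeled set whenever the evaluator prompt or model changes is one way to confirm the evaluator still tracks human judgment before trusting it in the harness.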