Pipevals: Evaluation pipelines for every LLM application

Lobste.rs AI

Systematically benchmark, evaluate, and monitor AI systems with visual evaluation pipelines.