Flux 2 Klein 9B is now up to 2× faster with multiple reference images (new model)

Under the hood: KV-caching lets the model skip redundant computation on your reference images. The more references you use, the bigger the speedup. Inference can be 2× faster or more for multi-reference editing. We're also releasing FP8 quantized weights, built with NVIDIA. submitted by /u/meknidirta
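
To illustrate why caching helps more as references are added, here is a toy sketch in pure Python. It is not the Flux 2 Klein implementation. `ToyAttentionLayer`, `project_kv`, and `projection_cost` are hypothetical names invented for this example, and the token counts and step count are arbitrary. The sketch assumes the reference-image tokens stay fixed across denoising steps, so their key/value projections can be computed once and reused, while the latent tokens change every step and are always recomputed. It only counts projected tokens and says nothing about real wall-clock speedups.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAttentionLayer:
    """Hypothetical stand-in for one attention layer; it only counts K/V work."""
    projected_tokens: int = 0
    ref_cache: dict = field(default_factory=dict)

    def project_kv(self, tokens):
        # Stand-in for the key and value projections (the work a cache can skip).
        self.projected_tokens += len(tokens)
        return [(t * 2.0, t * 3.0) for t in tokens]  # toy (key, value) pairs

    def step(self, latent_tokens, reference_tokens, use_cache):
        if use_cache:
            cache_key = tuple(reference_tokens)
            if cache_key not in self.ref_cache:
                # First step only: project the reference tokens and store the result.
                self.ref_cache[cache_key] = self.project_kv(reference_tokens)
            ref_kv = self.ref_cache[cache_key]
        else:
            # Without a cache the same reference tokens are re-projected every step.
            ref_kv = self.project_kv(reference_tokens)
        # Latent tokens change each denoising step, so they are always projected.
        latent_kv = self.project_kv(latent_tokens)
        return latent_kv + ref_kv


def projection_cost(num_refs, tokens_per_image=4, steps=4, use_cache=False):
    """Total tokens K/V-projected over all steps (all sizes are made-up toy values)."""
    layer = ToyAttentionLayer()
    reference_tokens = [float(i) for i in range(num_refs * tokens_per_image)]
    for s in range(steps):
        latent_tokens = [float(s)] * tokens_per_image
        layer.step(latent_tokens, reference_tokens, use_cache)
    return layer.projected_tokens


if __name__ == "__main__":
    for refs in (1, 2, 4):
        plain = projection_cost(refs)
        cached = projection_cost(refs, use_cache=True)
        print(f"{refs} reference(s): {plain} vs {cached} projected tokens "
              f"({plain / cached:.2f}x less projection work)")
```

In this toy setup the uncached cost grows with `steps * num_refs` while the cached cost grows only with `num_refs`, so the ratio widens as references are added. A real model still attends over the cached reference keys and values at every step, so actual gains depend on how much of the total compute those projections represent.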