ggml: add graph_reused by am17an · Pull Request #21764 · ggml-org/llama.cpp

r/LocalLLaMA

CUDA speedup. Submitted by /u/jacek2023.