Using custom kernels has never been easier!

r/StableDiffusion

Almost all of us have struggled to build powerful kernels such as Flash Attention 3, Sage Attention, and countless others! What if we could load prebuilt kernel binaries for supported hardware and get started right off the bat? No need to worry about rebuilding the kernels every time you update PyTorch!

Below is an example of how you would use Flash Attention 3:

```py
# make sure `kernels` is installed: `pip install -U kernels`
from kernels import get_kernel

kernel_module = get_kernel("kernels-community/flash-attn3")  # <- change the ID if needed
flash_attn_combine.
```
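If you want your script to keep working on machines where `kernels` is not installed or no prebuilt binary matches the hardware, one option is to wrap the load in a small guard. This is only a rough sketch, not something from the `kernels` docs: `load_kernel_or_none` and its `loader` parameter are hypothetical names I made up for illustration (the `loader` hook just makes it easy to test without downloading anything). The only real library call it relies on is `get_kernel(repo_id)`.

```py
import importlib


def load_kernel_or_none(repo_id, loader=None):
    """Return the kernel module for `repo_id`, or None if it cannot be loaded.

    `loader` is a hypothetical hook (an assumption of this sketch): any callable
    that takes a repo ID and returns a module. When omitted, `kernels.get_kernel`
    is used if the `kernels` package is importable.
    """
    if loader is None:
        try:
            loader = importlib.import_module("kernels").get_kernel
        except (ImportError, AttributeError):
            # `kernels` is not installed (or is too old to expose get_kernel).
            return None
    try:
        return loader(repo_id)
    except Exception:
        # Download failed, no binary for this hardware, etc.
        return None


# Example usage (commented out because it downloads from the Hub):
# kernel_module = load_kernel_or_none("kernels-community/flash-attn3")
# if kernel_module is None:
#     print("Prebuilt kernel unavailable, using the default attention path")
# else:
#     print([name for name in dir(kernel_module) if not name.startswith("_")])
```

When the helper returns `None`, you can fall back to whatever attention implementation your pipeline already uses. Listing `dir(kernel_module)` is a quick way to see which functions a given kernel actually exposes before calling them.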