Extracted MTP tensor GGUFs - smaller donor models for grafting.
r/LocalLLaMA
The script to graft MTP tensors requires a full GGUF model file as the donor. I felt that was a bit hefty, so I asked a local Gemma model to write something that extracts only the tensors the script needs. The result is two faux GGUFs weighing in at just 900MB (35A3B) and 450MB (27B), containing only those tensors and fully compatible with the script. That's a lot quicker to download than the original 38GB and 29GB models, for anyone who just wants to convert their existing library or save some bandwidth.
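For anyone curious what "extract just what's required" boils down to, here's a rough sketch of the selection step. It is not the script Gemma wrote, just the general idea under some loud assumptions: that MTP tensors can be picked out by a substring in their name (the `"mtp"` marker is a placeholder, check the names in your own model), and that you already have a list of tensor names and byte sizes from whatever GGUF reader you use. `TensorInfo`, `select_mtp`, and `total_mb` are made-up helper names, and the demo entries are invented.

```python
from typing import Iterable, List, NamedTuple, Sequence


class TensorInfo(NamedTuple):
    name: str     # tensor name as stored in the GGUF
    n_bytes: int  # size of the tensor data in bytes


# ASSUMPTION: MTP tensors are identifiable by a substring in the name.
# This marker is a placeholder, not a confirmed naming convention.
MTP_MARKERS = ("mtp",)


def select_mtp(tensors: Iterable[TensorInfo],
               markers: Sequence[str] = MTP_MARKERS) -> List[TensorInfo]:
    """Keep only tensors whose name contains one of the markers."""
    return [t for t in tensors if any(m in t.name for m in markers)]


def total_mb(tensors: Iterable[TensorInfo]) -> float:
    """Sum of tensor data sizes in megabytes (1 MB = 1e6 bytes)."""
    return sum(t.n_bytes for t in tensors) / 1e6


if __name__ == "__main__":
    # Invented names and sizes, only to show the filter working.
    demo = [
        TensorInfo("blk.0.attn_q.weight", 8_000_000),
        TensorInfo("mtp.fc.weight", 2_000_000),
        TensorInfo("output.weight", 5_000_000),
    ]
    kept = select_mtp(demo)
    print([t.name for t in kept], f"{total_mb(kept):.1f} MB")
```

The kept tensors would then be written out to a new, much smaller GGUF along with whatever metadata the grafting script expects. That writing step is left out here since it depends on the tooling you use.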