Got a 9B Abliterated Claude-Distilled model running for my local hermes

r/LocalLLaMA
Generative AI AI Hardware AI Tools

My laptop only has 6GB of VRAM, which wasn't enough to run abliterated model for my local AI. I managed to completely offload the inference to a free Google Colab T4 GPU and route the API straight back to my local CLI terminal using a Cloudflare tunnel. spent 0$ so far. for a test. submitted by /u/DjuricX [link] [comments]