How many of you have tried BeeLlama.cpp? How is it? Is agentic coding possible with 8GB VRAM?

r/LocalLLaMA
We'll be getting those features (check the bottom link) on mainline sooner or later anyway. But for now, this fork could be useful for seeing the full potential of our poor GPUs (and of big GPUs too). Are any 8GB VRAM (and 32GB RAM) folks already doing agentic coding with models (at Q4 or higher) like Qwen3.6-35B-A3B, Qwen3.6-27B, Gemma-4-31B, or Gemma-4-26B-A4B? I would love to see some t/s stats, full commands, and details on that. I'm not expecting any miracles with 8GB VRAM, but I still want to do something decent within these constraints.