Running Qwen3.5-27B locally as the primary model in OpenCode

r/LocalLLaMA
Generative AI AI Hardware Open Source AI

This weekend I wanted to test how well a local LLM can work as the primary model for an agentic coding assistant like OpenCode or OpenAI Codex. I picked Qwen3.5-27B, a hybrid architecture model that has been getting a lot of attention lately for its performance relative to its size, set it up locally and ran it with OpenCode to see how far it could go. I set it up on my NVIDIA RTX4090 (24GB) workstation running the model via llama.cpp and using it with OpenCode running on my macbook (connection via Tailscale.