Why is opencode so slow at processing the prompt with llama-server?

r/LocalLLaMA

I'm running opencode and llama-server locally. I have 32 GB of RAM and a 780M iGPU. With Qwen3.6 I get around 21 t/s, which should be decent, but opencode takes too long to process every input. What is it doing exactly? Tmux shows the available RAM at the bottom (8+ GB available). Server startup command below the video.