Why is opencode so slow at processing the prompt with llama-server?
r/LocalLLaMA
I'm running opencode and llama-server locally on a machine with 32 GB of RAM and a 780M iGPU. With Qwen3.6 I get around 21 t/s, which should be decent, but opencode takes too long to process every input. What exactly is it doing during that time? Tmux shows the available RAM at the bottom (8+ GB available). The server startup command is below the video.
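
To make the question concrete, here is a rough back-of-the-envelope sketch of how a request's wall time splits between reading the prompt (prefill) and generating the reply (decode). It assumes the 21 t/s figure is the generation speed. The prompt size, output length, and prefill rate are made-up placeholders, not measurements from this setup, and `estimate_latency` is a hypothetical helper, not part of opencode or llama-server.

```python
def estimate_latency(prompt_tokens, output_tokens, prefill_tps, decode_tps):
    """Return (prefill_seconds, decode_seconds) for a single request."""
    if prefill_tps <= 0 or decode_tps <= 0:
        raise ValueError("token rates must be positive")
    prefill_s = prompt_tokens / prefill_tps   # time spent reading the prompt
    decode_s = output_tokens / decode_tps     # time spent generating the reply
    return prefill_s, decode_s


# Only decode_tps=21 comes from the post; every other value is an assumption.
prefill_s, decode_s = estimate_latency(
    prompt_tokens=10_000,  # hypothetical prompt size (system prompt + tools + context)
    output_tokens=200,     # hypothetical reply length
    prefill_tps=100.0,     # hypothetical prompt-processing rate
    decode_tps=21.0,       # generation speed reported above
)
print(f"prefill: {prefill_s:.1f} s, decode: {decode_s:.1f} s")
```

If the real prompt and prefill rate are anywhere near these placeholders, the wait before the first token would dominate, no matter how fast generation is. The actual values would have to be measured on the server side.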