Issues with Gemma 4 tool calling - abrupt gen ending despite the model telling me it wants to do X.
r/LocalLLaMA
•
Open Source AI
Hello, I have noticed an annoying issue with Gemma 4 26b a4b. It seems like it cannot do multiple think->tool call->think->tool call turns. It can do multiple tool calls in one generation but when thinking inbetween that steps happens, it always say it is wanting to do X and then just ends the generation immediately. I am using a26b a4b q4_k_m with the latest chat template, interleaved or not, the old one, it doesn't make a difference. Does anyone else have this issue? Edit: thinking->tool call -> thinking -> tool call -> response to the user works.