Small models (Qwen 3.5 0.8B, Llama 3.2 1B, Gemma 3 1B) stuck in repetitive loops

r/LocalLLaMA

I'm working with small models (~1B parameters) and frequently run into output that gets stuck in loops, repeatedly generating the same sentences or phrases. This happens especially consistently when temperature is set low (e.g., 0.1-0.3).

What I've tried:

- Increasing temperature above 1.0: helps somewhat but doesn't fully solve the issue
- Setting repetition_penalty and other penalty parameters
- Adjusting top_p and top_k

Larger models from the same families (e.g., 3B+) don't exhibit this problem.
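
For anyone who wants to reason about why low temperature makes the looping worse, here is a rough sketch of how temperature and repetition_penalty act on the next-token logits. It assumes the CTRL-style penalty (divide positive logits by the penalty, multiply negative ones), which is how Hugging Face transformers implements repetition_penalty as far as I know; other backends may differ. The logits, the 4-token vocabulary, and the helper names are all made up for illustration and are not from any real model.

```python
import math

def apply_repetition_penalty(logits, seen_ids, penalty):
    """CTRL-style penalty (assumed): make already-generated tokens less likely."""
    out = list(logits)
    for i in set(seen_ids):
        # Positive logits shrink, negative logits move further from zero.
        out[i] = out[i] * penalty if out[i] < 0 else out[i] / penalty
    return out

def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities; a lower temperature sharpens the peak."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy case: token 0 was just generated and is the top candidate again.
logits = [3.0, 2.5, 1.0, -1.0]
seen = [0]

for t in (0.2, 1.0):
    plain = softmax_with_temperature(logits, t)
    penalized = softmax_with_temperature(
        apply_repetition_penalty(logits, seen, 1.3), t
    )
    print(f"T={t}: p(repeat) {plain[0]:.3f} -> {penalized[0]:.3f} with penalty 1.3")
```

In this toy case, low temperature pushes almost all of the probability mass onto the token that was already emitted, which is the situation where a loop becomes self-reinforcing. The penalty only rescales the logits of tokens already in the context, so it can discourage exact token repeats but does not directly target repeated phrases.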