Why Local LLM JSON Output Breaks — Failure Patterns and How to Fix Them in Code
Dev.to AI
API Gets One Line. Local Gets a Minefield.

Claude has an equivalent. Set it, and the output is guaranteed to be JSON. Parse errors don't happen.

Local LLMs don't have this. llama.cpp offers --grammar to constrain output to valid JSON syntax, but that only forces the format to be JSON. Whether the content makes sense is a completely different problem.

    // API output: as intended
    { "name": "Qwen2.5-14B", "speed_tps": 31.5, "vram_gb": 7.3 }

    // Local LLM (grammar enabled): valid JSON, broken content
    { "name": "Qwen2.5-14B", "speed_tps": "fast", "vram_gb": "enough" }
    // → Types are wrong.
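Because a grammar only guarantees syntax, the type mismatch in the second record has to be caught after parsing. The following is a minimal sketch of one way to do that with the Python standard library alone. The `EXPECTED_TYPES` table and the `find_type_errors` helper are hypothetical names introduced for illustration, and the schema is simply inferred from the example record; neither is part of llama.cpp or any other library.

```python
import json

# Hypothetical schema inferred from the example record: field name -> accepted types.
EXPECTED_TYPES = {
    "name": (str,),
    "speed_tps": (int, float),
    "vram_gb": (int, float),
}


def find_type_errors(raw: str) -> list[str]:
    """Return a list of problems found in raw; an empty list means it passed."""
    try:
        record = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]

    if not isinstance(record, dict):
        return ["top-level value is not an object"]

    errors = []
    for field, accepted in EXPECTED_TYPES.items():
        if field not in record:
            errors.append(f"{field}: missing")
            continue
        value = record[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, accepted):
            wanted = " or ".join(t.__name__ for t in accepted)
            errors.append(f"{field}: expected {wanted}, got {type(value).__name__}")
    return errors


if __name__ == "__main__":
    local_output = '{"name": "Qwen2.5-14B", "speed_tps": "fast", "vram_gb": "enough"}'
    for problem in find_type_errors(local_output):
        print(problem)
```

Returning a list of problems rather than raising on the first one makes it easier to feed every error back to the model in a retry prompt, if that is the recovery strategy in use.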