Has anyone got audio working in the small gemma-4 models?

I'm trying a VAD speech chunk > LLM > TTS pipeline that skips the ASR step completely, but audio just refuses to work. I've tried multiple llama.cpp builds and Unsloth Studio, with no luck so far. The only thing that works is LiteRT LM by Google, but it forces CPU-only inference when audio is involved, and that kills performance. I saw on GitHub that the GPU implementation is still pending. Is there any workaround, or a different stack that actually works?

submitted by /u/KokaOP