LTX 2.3 can generate some really decent singing and music too
r/StableDiffusion
I've been messing around with the new LTX 2.3 model using this i2 workflow, and I'm actually surprised by how much better the audio is. It's almost as capable as Suno 3-4 when it comes to singing and vocals. For actual beats or instrumentation, I'd say it's not quite there: the drums and bass sound a bit hollow and artificial, but it's still a huge leap from 2.0. I used the LTXGemmaEnhancePrompt node, which really seems to help with results: "A medium shot captures a female indie folk singer, her eyes closed and mouth slightly open, singing into a vintage-style microphone.