Gemma 4 is excellent for image to prompt
r/StableDiffusion
•
Open Source AI
I used Qwen 3 8b VL for a long time for image to prompt but now that I have tried Gemma4 26b I am delighted with how much detail can be extracted from the image, and how much it can improve the prompt. I've also tried larger Qwen3 models but they can't even approach the Gemma models. From the LM studio, I start Gemma, give him a picture and make a prompt of it just and structure according to the image model that I use mostly Zit sometimes Flux, ERNIE-Image I haven't tried yet, but I don't see a reason why I wouldn't have great results on it. submitted by /u/Arrow2304 [link] [comments.