Is anyone using models to describe an image and get a prompt? Is there much difference between Qwen 3.5 9b vs Qwen 3.5 27b, vs gemma 4 27b and another model you use ?

Obviously there's a difference, but it's still not entirely clear to me. Some models generate very detailed descriptions, but lose realism. I think that's the case with joycaption; I don't know exactly why this happens. With JoyCaption, there's a tendency to produce images that don't make much sense. ChatGPT descriptions produce coherent images, but they're less interesting. isn't always better. Some models, for reasons unknown, stimulate the "neurons" of specific image generators better. submitted by /u/More_Bid_2197 [link] [comments.