Seeing activity pick up big time in this sub thanks to the various open models. Most of them require at least 16 GB of VRAM. What can I do with 8 GB?

I'm not deeply technical, but I have run a few models locally before, sometime before Gemma 4 dropped. I tried a low quant of Qwen 2.5 Coder, and after some tinkering I got it to run, but it was just so slow, obviously. It seems a lot has changed in the meantime, and there might be something useful now? I'm looking at either coding (some quant of Qwen 3.6 27B maybe?) or image understanding/data extraction. I tested the 3.6 27B on checkbox extraction for a work tool and it worked pretty great on my RunPod instance.