model: support step3-vl-10b by forforever73 · Pull Request #21287 · ggml-org/llama.cpp
r/LocalLLaMA
•
Generative AI
Open Source AI
STEP3-VL-10B is a lightweight open-source foundation model designed to redefine the trade-off between compact efficiency and frontier-level multimodal intelligence. Despite its compact 10B parameter footprint, STEP3-VL-10B excels in visual perception, complex reasoning, and human-centric alignment. It consistently outperforms models under the 10B scale and rivals or surpasses significantly larger open-weights models ( 10×-20× its size ), such as GLM-4.6V (106B-A12B), Qwen3-VL-Thinking (235B-A22B), and top-tier.