Last Week in Multimodal AI - Local Edition

r/LocalLLaMA

I curate a weekly multimodal AI roundup. Here are the local/open-source highlights from last week:

LTX-2.3 - Lightricks
Better prompt following and native portrait mode up to 1080x1920. The community built GGUF workflows, a desktop app, and a Linux port within days of release.
Model | HuggingFace

Helios - PKU-YuanGroup
14B video model running in real time on a single GPU. Supports t2v, i2v, and v2v up to a minute long. The numbers seem too good to be true, so it's worth testing yourself.
HuggingFace | GitHub

Kiwi-Edit
Text- or image-prompted video editing with temporal consistency.