Show HN: I over-engineered a home security camera that uses an LLM and talks

Roz is an open-source, Python-based pipeline that captures a webcam feed, detects motion, sends the frames to a local Vision LLM to analyze the scene, and then uses text-to-speech to announce any meaningful changes aloud.

The Backstory: I heard an ad for Google’s Home Premium Advanced service, which claims to analyze your Nest doorbell images and describe what it sees. I thought that sounded cool, but I didn't want to pay $20/month for it or send my camera feeds to the cloud. I wanted to see if I could build a local, subscription-free version myself.
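To make the pipeline described above more concrete (capture, motion check, scene description, spoken announcement), here is a minimal sketch of the control flow. It is not Roz's actual code. The names `motion_score`, `run_pipeline`, `describe_scene`, and `speak` are hypothetical, the thresholds are arbitrary, and frames are modeled as plain grayscale lists so the example runs with only the standard library. A real setup would presumably plug in a webcam reader, a local vision model, and a text-to-speech engine for the two callables. Treating a "meaningful change" as "the description text differs from the last one announced" is also an assumption made for illustration.

```python
from typing import Callable, Iterable, List, Optional

# Assumption: a frame is a grayscale image stored as rows of 0-255 ints.
Frame = List[List[int]]


def motion_score(prev: Frame, curr: Frame, pixel_delta: int = 25) -> float:
    """Return the fraction of pixels whose brightness changed by more than pixel_delta."""
    total = 0
    changed = 0
    for prev_row, curr_row in zip(prev, curr):
        for a, b in zip(prev_row, curr_row):
            total += 1
            if abs(a - b) > pixel_delta:
                changed += 1
    return changed / total if total else 0.0


def run_pipeline(
    frames: Iterable[Frame],
    describe_scene: Callable[[Frame], str],  # stand-in for the local vision LLM
    speak: Callable[[str], None],            # stand-in for text-to-speech
    motion_threshold: float = 0.02,
) -> List[str]:
    """Announce a scene description whenever motion is seen and the description changed."""
    announcements: List[str] = []
    prev: Optional[Frame] = None
    last_description: Optional[str] = None
    for frame in frames:
        # Only call the (expensive) model when enough pixels changed.
        if prev is not None and motion_score(prev, frame) >= motion_threshold:
            description = describe_scene(frame)
            # Skip repeats so the same scene is not announced twice in a row.
            if description != last_description:
                speak(description)
                announcements.append(description)
                last_description = description
        prev = frame
    return announcements


if __name__ == "__main__":
    empty = [[0] * 4 for _ in range(4)]
    visitor = [[0] * 4 for _ in range(2)] + [[255] * 4 for _ in range(2)]

    # Toy describer: looks at one pixel instead of calling a model.
    def fake_describe(frame: Frame) -> str:
        return "Someone is at the door." if frame[3][0] else "The porch is empty."

    run_pipeline([empty, empty, visitor, visitor, empty], fake_describe, print)
```

Gating the model call behind a cheap frame difference means the slow step only runs when something in view has actually changed, which matters when the vision model runs on local hardware.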