PixelClaw: an LLM agent for image manipulation
r/artificial
I'm making an LLM agent specialized for image processing. It combines:

- an LLM for conversation, planning, and tool use (supports a variety of LLMs)
- image generation/AI-based editing via gpt-image
- background removal via rembg (several specialized models available)
- pixelization using pyxelate
- posterization and defringing using custom algorithms
- speech-to-text (Whisper) and text-to-speech (Kokoro plus HALO)
- a nice UI based on Raylib, including file drag-and-drop

PixelClaw is free and open-source at. You can find videos there too.
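For anyone wondering what posterization means here: the rough idea is to snap every color channel to a small set of evenly spaced levels. Below is a minimal generic sketch of that idea in plain Python. It is not PixelClaw's actual custom algorithm (the post doesn't describe it), and the names `posterize_channel` and `posterize` are made up for illustration. Pixels are assumed to be RGBA tuples with 0-255 values, and alpha is left untouched.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int, int]  # (r, g, b, a), each 0-255 (assumed layout)


def posterize_channel(value: int, levels: int) -> int:
    """Snap one 0-255 channel value to the nearest of `levels` evenly spaced values."""
    if levels < 2:
        raise ValueError("levels must be at least 2")
    step = 255 / (levels - 1)      # distance between adjacent output levels
    index = round(value / step)    # index of the nearest level
    return int(round(index * step))


def posterize(pixels: List[Pixel], levels: int) -> List[Pixel]:
    """Posterize the RGB channels of each pixel; alpha is passed through unchanged."""
    return [
        (
            posterize_channel(r, levels),
            posterize_channel(g, levels),
            posterize_channel(b, levels),
            a,
        )
        for (r, g, b, a) in pixels
    ]


if __name__ == "__main__":
    # With 4 levels the allowed channel values are 0, 85, 170, 255.
    print(posterize([(128, 42, 255, 200)], levels=4))
```

A real implementation would more likely work on a NumPy array or a palette chosen from the image itself, but the per-channel snapping above is the core of the simplest version.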