3º. Entity extraction with a 2B model: benchmarks from a personal knowledge graph

When you're building a personal knowledge graph - the kind that automatically discovers that "Ana García" appears in your emails, your calendar, and tomorrow's meeting notes - you need entity extraction. The industry answer is to throw GPT-4 at it and move on. But when your system runs on a mini-PC in someone's living room, you need something that fits in 2GB of RAM. We benchmarked qwen3-vl:2b-instruct-q4_K_M - a 2-billion parameter multimodal model, quantized to 4-bit - running locally through Ollama. The same model that describes our photos also extracts entities from text.