[Release] Apex-1: A 350M Tiny-LLM trained locally on an RTX 5060 Ti 16GB

r/LocalLLaMA

Hey everyone! I wanted to share my latest project: Apex-1, a lightweight 350M-parameter model designed for speed and efficiency on edge devices.

The Goal: I wanted to see how much "world knowledge" and instruction-following I could cram into a tiny model using consumer hardware and high-quality data.

Key Info:
- Architecture: Transformer, based on nanoGPT.
- Dataset: Pre-trained on a subset of FineWeb-Edu (the 10BT sample) for reasoning and knowledge.
- Fine-tuning: Alpaca-Cleaned for better instruction following.
- Format: Weights available as ONNX (well suited to mobile/web) and standard PyTorch.
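For anyone curious how a nanoGPT-style model lands at roughly 350M parameters, here is a minimal sketch of the arithmetic. The post does not list Apex-1's actual hyperparameters, so the config below is an assumption: it uses the GPT-2 medium shape (24 layers, 1024-wide embeddings, GPT-2 vocabulary) with biases enabled and the output head tied to the token embedding. `GPTConfig` and `count_params` are illustrative names, not code from the Apex-1 release.

```python
from dataclasses import dataclass


@dataclass
class GPTConfig:
    # Hypothetical values (GPT-2 medium shape); not confirmed for Apex-1.
    n_layer: int = 24
    n_embd: int = 1024
    vocab_size: int = 50257
    block_size: int = 1024


def count_params(cfg: GPTConfig) -> int:
    """Count parameters of a GPT-2-style decoder with biases and a tied output head."""
    d = cfg.n_embd
    tok_emb = cfg.vocab_size * d   # token embedding, shared with the output head
    pos_emb = cfg.block_size * d   # learned position embedding
    # Per transformer block:
    #   attention QKV projection: 3*d*d + 3*d
    #   attention output proj:      d*d +   d
    #   MLP (d -> 4d -> d):       8*d*d + 5*d
    #   two LayerNorms:                   4*d
    per_block = 12 * d * d + 13 * d
    final_ln = 2 * d               # final LayerNorm weight and bias
    return tok_emb + pos_emb + cfg.n_layer * per_block + final_ln


if __name__ == "__main__":
    n = count_params(GPTConfig())
    print(f"{n:,} parameters (~{n / 1e6:.0f}M)")
    print(f"~{2 * n / 1e9:.2f} GB of weights in fp16")
```

Under these assumptions the weights alone take about 0.7 GB at 16-bit precision, which is why a model this size is a plausible fit for edge devices. A different tokenizer, context length, or untied head would shift the total.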