[D] Released a 100k-sample dataset on Hugging Face
r/LocalLLaMA
•
Machine Learning
Generative AI
AI Research
AI Tools
We’ve released a 100,000-sample Chain-of-Thought (CoT) dataset for fine-tuning local reasoning models. Each sample includes explicit intermediate reasoning traces, rather than answer-only supervision. The goal is to improve reasoning consistency during supervised fine-tuning, especially for smaller local models. We’re sharing it here to gather feedback from people working on local LLM fine-tuning and reasoning distillation.