PIKA: Expert-Level Synthetic Datasets for Post-Training Alignment from Scratch

arXiv:2510.06670v2 Announce Type: replace High-quality instruction data is critical for LLM alignment, yet existing open-source datasets often lack efficiency, requiring hundreds of thousands of examples to approach