AI RESEARCH

What Makes Good Instruction-Tuning Data? An In-Context Learning Perspective

arXiv cs.CL

arXiv:2604.25132v1 Announce Type: new Instruction-tuning datasets often contain substantial redundancy and low-quality samples, necessitating effective data selection methods. We propose an instruction data selection framework based on weighted in-context influence (wICI), which measures how effectively each candidate example reduces instruction-following difficulty for semantically related peers.
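The abstract does not give the exact wICI formula, so the following is only an illustrative sketch of one possible reading: score a candidate by the similarity-weighted average drop in loss on its nearest peers when the candidate is supplied as an in-context demonstration. The names `weighted_influence`, `select_top`, the `loss` callback, the cosine weighting, and the neighbor count `k` are all assumptions made for illustration, not the paper's definitions.

```python
import math
from typing import Callable, List, Optional, Sequence

Vector = Sequence[float]
# Hypothetical callback: loss(peer, demo) is the model's loss on example `peer`
# when example `demo` is given as an in-context demonstration (None = zero-shot).
LossFn = Callable[[int, Optional[int]], float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def weighted_influence(
    candidate: int, embeddings: Sequence[Vector], loss: LossFn, k: int = 2
) -> float:
    """Assumed score: similarity-weighted mean loss reduction on the k nearest peers."""
    sims = [
        (cosine(embeddings[candidate], embeddings[j]), j)
        for j in range(len(embeddings))
        if j != candidate
    ]
    peers = sorted(sims, reverse=True)[:k]  # most similar peers first
    total_weight = sum(max(s, 0.0) for s, _ in peers)
    if total_weight == 0.0:
        return 0.0
    # Positive gain means the candidate made the peer easier to follow.
    gain = sum(max(s, 0.0) * (loss(j, None) - loss(j, candidate)) for s, j in peers)
    return gain / total_weight


def select_top(
    embeddings: Sequence[Vector], loss: LossFn, budget: int, k: int = 2
) -> List[int]:
    """Return the indices of the `budget` highest-scoring candidates."""
    scores = [weighted_influence(i, embeddings, loss, k) for i in range(len(embeddings))]
    ranked = sorted(range(len(embeddings)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]
```

In practice the `loss` callback would wrap a language model's per-example loss with and without the demonstration in the prompt, and the embeddings would come from a sentence encoder; both are left abstract here because the abstract does not specify them.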