AI RESEARCH

EmbGen: Teaching with Reassembled Corpora

arXiv CS.AI

arXiv:2605.19394v1 Announce Type: cross Adapting small instruction-tuned models to specialized domains often relies on supervised fine-tuning (SFT) on curated instruction-response examples, which is expensive to collect at scale. Synthetic