REALM: Reliable Expertise-Aware Language Model Fine-Tuning from Noisy Annotations

arXiv:2604.17289v1

Supervised fine-tuning of large language models relies on human-annotated data, yet annotation pipelines routinely involve multiple crowdworkers of heterogeneous expertise. Standard practice aggregates labels via majority vote or simple averaging, discarding annotator identity and causing the model to absorb the errors of unreliable annotators directly into its parameters.