Estimating Item Difficulty with Large Language Models as Experts

arXiv:2605.18562v1 Announce Type: cross Accurate estimates of item difficulty are essential for valid assessment and effective adaptive learning. However, for newly created tasks, response data are typically unavailable. Pretesting and expert judgement can be costly and slow, while machine learning methods often require large labelled