AI RESEARCH

Do Large Language Models Plan Answer Positions? Position Bias in Multiple-Choice Question Generation

arXiv cs.CL

arXiv:2605.01846v1

Large language models (LLMs) are increasingly used to generate multiple-choice questions (MCQs), a setting in which the correct answer should ideally be uniformly distributed across option positions. However, we observe that LLMs exhibit systematic position biases during generation. Through extensive experiments with 10 LLMs and 5 vision-language models (VLMs) on three MCQ generation tasks, we show that these biases are structured, with similar patterns emerging within model families.
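To make the uniformity criterion concrete, the sketch below counts how often each option position holds the correct answer and computes a Pearson chi-square statistic against a uniform distribution. It is an illustration only, not the paper's evaluation method: the function names, the four-option label set, and the sample answer keys are all hypothetical.

```python
from collections import Counter


def position_counts(answer_keys, labels=("A", "B", "C", "D")):
    """Count how often each option label is the correct answer.

    `labels` is an assumed four-option layout; labels that never
    appear in `answer_keys` are reported with a count of zero.
    """
    counts = Counter(answer_keys)
    return [counts.get(label, 0) for label in labels]


def chi_square_uniform(counts):
    """Pearson chi-square statistic against a uniform distribution.

    A value of 0 means the correct answers are spread perfectly evenly;
    larger values indicate a stronger departure from uniformity.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("need at least one answer key")
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


# Hypothetical answer keys from eight generated questions (not real data).
keys = ["A", "C", "C", "B", "C", "D", "C", "A"]
counts = position_counts(keys)
print(counts)                      # [2, 1, 4, 1]
print(chi_square_uniform(counts))  # 3.0
```

With four options the statistic has three degrees of freedom, so a value above roughly 7.81 would reject uniformity at the 5% level. The toy sample above is far too small for that comparison to be meaningful; a real check would use many more generated questions.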