AI RESEARCH

Who Decides What Is Harmful? Content Moderation Policy Through A Multi-Agent Personalised Inference Framework

arXiv CS.CL

arXiv:2605.01416v1 Announce Type: cross The increasing scale and complexity of online platforms raise critical policy questions around harmful content, digital well-being, and user autonomy. Traditional content moderation systems rely on centralised, top-down rules and often fail to accommodate the subjective nature of harm perception. This paper proposes an LLM-based multi-agent personalised inference framework that filters content based on the unique sensitivity profiles of individual users.