DataDignity: Training Data Attribution for Large Language Models

arXiv:2605.05687v1 Announce Type: new Auditing language-model outputs often requires more than judging correctness: an auditor may need to identify which source document most likely supports the knowledge expressed in a response. We study this as pinpoint provenance: given a prompt, a target-model response, and a candidate corpus, rank the documents that best support the response. We