Learning the Signature of Memorization in Autoregressive Language Models

arXiv:2604.03199v1 Announce Type: cross All prior membership inference attacks for fine-tuned language models use hand-crafted heuristics (e.g., loss thresholding, Min-K%, reference calibration), each bounded by the designer's intuition. We