AI RESEARCH

CLiGNet: Clinical Label-Interaction Graph Network for Medical Specialty Classification from Clinical Transcriptions

arXiv CS.AI

arXiv:2603.22752v1 Announce Type: new Automated classification of clinical transcriptions into medical specialties is essential for routing, coding, and clinical decision support, yet prior work on the widely used MTSamples benchmark suffers from severe data leakage caused by applying SMOTE oversampling before train-test splitting. We first document this methodological flaw and establish a leakage-free benchmark across 40 medical specialties (4966 records), revealing that the true task difficulty is substantially higher than previously reported.
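
To make the leakage concrete: SMOTE synthesizes new minority-class samples by interpolating between existing ones, so if it runs before the split, synthetic training points can be built from records that later land in the test set. The sketch below illustrates the leakage-free ordering (split first, oversample the training portion only). It is not the paper's code. The function names are hypothetical, and the interpolation between two randomly chosen minority points is a simplified stand-in for real SMOTE, which interpolates toward k nearest neighbors.

```python
import random


def interpolate_oversample(points, n_new, rng):
    """Simplified SMOTE-style synthesis (assumption: random pairs, not k-NN).

    Each synthetic point lies on the segment between two distinct
    minority points drawn at random from `points`.
    """
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(points, 2)
        t = rng.random()
        synthetic.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return synthetic


def split(records, test_fraction, rng):
    """Shuffle a copy of `records` and return (train, test)."""
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def leakage_free_pipeline(minority, majority, test_fraction=0.25, seed=0):
    """Split each class first, then oversample the training minority only."""
    rng = random.Random(seed)
    train_min, test_min = split(minority, test_fraction, rng)
    train_maj, test_maj = split(majority, test_fraction, rng)

    # Synthetic points are derived from training records only, so no
    # information from the held-out records reaches the training set.
    n_new = max(0, len(train_maj) - len(train_min))
    synthetic = interpolate_oversample(train_min, n_new, rng)

    train = [(p, 1) for p in train_min + synthetic] + [(p, 0) for p in train_maj]
    test = [(p, 1) for p in test_min] + [(p, 0) for p in test_maj]
    return train, test
```

The flawed ordering would call `interpolate_oversample` on the full minority set and split afterwards, which lets interpolated near-copies of test records into training and inflates the reported scores. Here the test set contains only original, untouched records.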