AI RESEARCH

Large Language Models are Powerful Electronic Health Record Encoders

arXiv CS.AI

arXiv:2502.17403v5

Electronic Health Records (EHRs) offer considerable potential for clinical prediction, but their complexity and heterogeneity challenge traditional machine learning. Domain-specific EHR foundation models trained on unlabeled EHR data have shown improved predictive accuracy and generalization. However, their development is constrained by limited data access and site-specific vocabularies.