Correcting heterogeneous diagnostic bias when developing clinical prediction models using causal hidden Markov models

arXiv:2605.06059v1 Announce Type: cross In routine care, individuals identified a priori as high-risk are usually tested for conditions more frequently. Protected attributes, such as sex or ethnicity, may also determine testing frequency. Such heterogeneous detection rates across a population induce label error, which causes systematic model error for specific groups and biases performance metrics during validation. This paper proposes a method to correct for bias in prediction models that arises from differential diagnostic delay.
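
To illustrate how differential testing frequency alone can induce label error, the following is a minimal toy calculation, not the method proposed in the paper. It assumes two hypothetical groups with the same true prevalence, independent testing opportunities in each period, and a perfectly sensitive test; the function names, group names, and numeric values are all illustrative assumptions.

```python
def detection_probability(test_rate: float, periods: int) -> float:
    """Probability that a true case is tested at least once within `periods`.

    Assumes an independent chance `test_rate` of being tested in each period
    and a test that always detects the condition when it is performed.
    """
    return 1.0 - (1.0 - test_rate) ** periods


def observed_prevalence(true_prevalence: float, test_rate: float, periods: int) -> float:
    """Share of the group that receives a positive label within the window."""
    return true_prevalence * detection_probability(test_rate, periods)


# Hypothetical per-period testing rates for two groups (illustrative values).
groups = {"frequently_tested": 0.5, "rarely_tested": 0.1}
true_prevalence = 0.2  # identical in both groups by assumption
periods = 3            # length of the observation window

for name, rate in groups.items():
    detected = detection_probability(rate, periods)
    labelled = observed_prevalence(true_prevalence, rate, periods)
    # 1 - detected is the share of true cases still labelled negative.
    print(f"{name}: detected={detected:.3f}, "
          f"observed prevalence={labelled:.4f}, "
          f"missed cases={1.0 - detected:.3f}")
```

Under these assumptions both groups carry the same underlying risk, yet the rarely tested group shows a much lower observed prevalence, so a model trained or validated on the recorded labels would appear to find that group at lower risk.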