Tracking Adaptation Time: Metrics for Temporal Distribution Shift

arXiv:2604.07266v1

Evaluating robustness under temporal distribution shift remains an open challenge. Existing metrics quantify the average decline in performance but fail to capture how models adapt to evolving data. As a result, temporal degradation is often misinterpreted: when accuracy declines, it is unclear whether the model is failing to adapt or whether the data itself has become inherently harder to learn. In this work, we propose three complementary metrics to distinguish adaptation from intrinsic difficulty in the data.