A Probabilistic Consensus-Driven Approach for Robust Counterfactual Explanations

arXiv:2604.17494v1 Announce Type: new Counterfactual explanations (CFEs) are essential for interpreting black-box models, yet they often become invalid when the underlying model changes slightly. Existing methods for generating robust CFEs are often limited to specific model types, require costly tuning, or offer only inflexible robustness controls. We propose a novel approach that jointly models the data distribution and the space of plausible model decisions to ensure robustness to model changes.
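
To make the robustness problem concrete, the following minimal sketch illustrates how a counterfactual can stay valid or become invalid under small model changes. It is not the method proposed in the paper: the linear classifier, the Gaussian perturbation of its parameters, and the names `predict`, `perturbed_models`, and `validity_rate` are all assumptions chosen for illustration. The sketch measures the fraction of slightly perturbed models that still assign the target class to a candidate counterfactual.

```python
import random


def predict(weights, bias, x):
    """Hypothetical linear classifier: class 1 if the score is positive, else 0."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else 0


def perturbed_models(weights, bias, n_models, scale, seed=0):
    """Simulate 'slightly changed' models by adding Gaussian noise to the parameters.

    This stands in for retraining or fine-tuning; it is an illustrative assumption.
    """
    rng = random.Random(seed)
    return [
        ([w + rng.gauss(0, scale) for w in weights], bias + rng.gauss(0, scale))
        for _ in range(n_models)
    ]


def validity_rate(models, x, target=1):
    """Fraction of models that still assign the target class to x."""
    hits = sum(predict(w, b, x) == target for w, b in models)
    return hits / len(models)


# Base model with decision boundary x1 + x2 = 1.
base_w, base_b = [1.0, 1.0], -1.0
models = perturbed_models(base_w, base_b, n_models=200, scale=0.1)

# Two candidate counterfactuals, both classified as 1 by the base model.
x_close = [0.52, 0.52]  # barely across the decision boundary
x_far = [1.0, 1.0]      # well inside the target region

rate_close = validity_rate(models, x_close)
rate_far = validity_rate(models, x_far)
print(f"validity near the boundary: {rate_close:.2f}")
print(f"validity far from the boundary: {rate_far:.2f}")
```

In this toy setting the counterfactual that barely crosses the boundary is rejected by a substantial share of the perturbed models, while the one further inside the target region remains valid for nearly all of them. A robustness-aware generator would trade off this kind of agreement against proximity to the original input.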