Dynamics-Aligned Shared Hypernetworks for Contextual RL under Discontinuous Shifts

arXiv:2602.06550v2 Announce Type: replace-cross

Zero-shot generalization in contextual reinforcement learning remains a core challenge, particularly when the context is latent and must be inferred from data. A canonical failure mode arises when the latent context discontinuously changes how actions affect the environment, so that different contexts require incompatible control responses. We propose DMA*-SH, a framework in which a single hypernetwork, trained solely via dynamics prediction, generates a small set of adapter weights shared across the dynamics model, policy, and action-value function.
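
To make the weight-sharing idea concrete, the following is a minimal, dependency-free sketch of the forward pass only. It is an illustration under stated assumptions, not the paper's implementation: the adapter form (a per-feature scale and shift), the layer sizes, and all names (`SharedHypernetwork`, `Head`, `adapt`) are hypothetical, and the context vector is supplied by hand rather than inferred from data. Training is omitted; per the abstract, only the dynamics-prediction loss would update the hypernetwork, while the policy and action-value function consume the same adapter.

```python
import math
import random


def linear(W, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix; the initialization scheme is an arbitrary choice."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class SharedHypernetwork:
    """Maps a context vector to ONE set of adapter weights.

    Assumption: the adapter is a per-feature (scale, shift) pair.
    The abstract only says "a small set of adapter weights".
    """

    def __init__(self, ctx_dim, feat_dim, rng):
        self.feat_dim = feat_dim
        self.W = rand_matrix(2 * feat_dim, ctx_dim, rng)

    def adapters(self, ctx):
        out = linear(self.W, ctx)
        scale = [1.0 + v for v in out[: self.feat_dim]]  # centered on identity
        shift = out[self.feat_dim:]
        return scale, shift


def adapt(features, adapter):
    """Apply the shared adapter to a hidden feature vector."""
    scale, shift = adapter
    return [s * f + b for s, f, b in zip(scale, features, shift)]


class Head:
    """Linear trunk -> tanh -> shared adapter -> linear output layer."""

    def __init__(self, in_dim, feat_dim, out_dim, rng):
        self.trunk = rand_matrix(feat_dim, in_dim, rng)
        self.out = rand_matrix(out_dim, feat_dim, rng)

    def __call__(self, x, adapter):
        hidden = [math.tanh(v) for v in linear(self.trunk, x)]
        return linear(self.out, adapt(hidden, adapter))


rng = random.Random(0)
STATE_DIM, ACTION_DIM, CTX_DIM, FEAT_DIM = 3, 2, 4, 8  # hypothetical sizes

hyper = SharedHypernetwork(CTX_DIM, FEAT_DIM, rng)
dynamics = Head(STATE_DIM + ACTION_DIM, FEAT_DIM, STATE_DIM, rng)
policy = Head(STATE_DIM, FEAT_DIM, ACTION_DIM, rng)
q_function = Head(STATE_DIM + ACTION_DIM, FEAT_DIM, 1, rng)

state = [0.1, -0.2, 0.3]
ctx = [0.5, -0.5, 0.25, 0.0]  # stand-in for an inferred latent context

# One adapter is generated once and reused by all three components.
adapter = hyper.adapters(ctx)
action = policy(state, adapter)
next_state_pred = dynamics(state + action, adapter)
q_value = q_function(state + action, adapter)
```

Because the three heads receive the same `adapter` object, whatever the hypernetwork learns about how context alters the dynamics is automatically visible to the policy and the action-value function. In a gradient-based version of this sketch, that would correspond to blocking policy and value gradients from flowing into `hyper`.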