Intrinsic Vicarious Conditioning for Deep Reinforcement Learning

arXiv:2605.12224v1 Announce Type: new Advancements in reinforcement learning have produced a variety of complex and useful intrinsic driving forces; crucially, these drivers operate under a direct conditioning paradigm. This form of conditioning limits our agents' capacity by restricting how they learn from the environment as well as from others. Off-policy or learn-by-example methods can learn from demonstrators' representations, but they require access to the demonstrating agent's policies or their reward functions. Our work overcomes this direct sampling limitation by.