AI RESEARCH

Generative Adversarial Post-Training Mitigates Reward Hacking in Live Human-AI Music Interaction

arXiv cs.LG

arXiv:2511.17879v4

Most applications of generative AI involve a sequential interaction in which a person inputs a prompt and waits for a response, and in which reaction time and adaptivity are not important factors. In contrast, live jamming is a collaborative interaction that requires real-time coordination and adaptation without access to the other player's future moves, while preserving diversity to sustain a creative flow. Reinforcement learning post-