AI RESEARCH

TimberAgent: Gram-Guided Retrieval for Executable Music Effect Control

arXiv CS.AI

arXiv:2603.09332v1 Announce Type: cross

Digital audio workstations expose rich effect chains, yet a semantic gap remains between perceptual user intent and low-level signal-processing parameters. We study retrieval-grounded audio effect control, where the output is an editable plugin configuration rather than a finalized waveform. Our focus is Texture Resonance Retrieval (TRR), an audio representation built from Gram matrices of projected mid-level Wav2Vec2 activations. This design preserves texture-relevant co-activation structure.
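The idea of a Gram-matrix descriptor feeding a retrieval step can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not state the layer choice, projection, similarity measure, or preset format, so everything below is an assumption made for illustration. Random arrays stand in for Wav2Vec2 activations, the projection is a random matrix, similarity is cosine over flattened Gram matrices, and the plugin configurations are hypothetical.

```python
import numpy as np


def gram_descriptor(activations, projection):
    """Gram matrix of projected activations.

    activations: (frames, channels) array, a stand-in for mid-level
                 Wav2Vec2 features (assumption for this sketch).
    projection:  (channels, proj_dim) array, a hypothetical projection.
    Returns a (proj_dim, proj_dim) matrix of channel co-activations,
    averaged over frames.
    """
    projected = activations @ projection
    return projected.T @ projected / projected.shape[0]


def cosine(a, b):
    """Cosine similarity between two matrices, compared as flat vectors."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


def retrieve(query_gram, library):
    """Return the library entry whose descriptor is most similar to the query."""
    return max(library, key=lambda entry: cosine(query_gram, entry["gram"]))


rng = np.random.default_rng(0)
frames, channels, proj_dim = 50, 16, 4
projection = rng.standard_normal((channels, proj_dim)) / np.sqrt(channels)

# Synthetic activations for two reference sounds (not real audio features).
acts_a = rng.standard_normal((frames, channels))
acts_b = rng.standard_normal((frames, channels)) * np.linspace(0.1, 2.0, channels)

# Hypothetical library: each descriptor is paired with an editable configuration.
library = [
    {"gram": gram_descriptor(acts_a, projection),
     "config": {"plugin": "reverb", "mix": 0.3}},
    {"gram": gram_descriptor(acts_b, projection),
     "config": {"plugin": "chorus", "depth": 0.5}},
]

# Query: the first sound with its frames in reverse order.
query = gram_descriptor(acts_a[::-1], projection)
best = retrieve(query, library)
print(best["config"])
```

Because the Gram matrix sums over frames, reordering the frames leaves it unchanged up to floating-point error, so the reversed query matches the first library entry. This is the sense in which such a descriptor keeps co-activation structure while discarding temporal order. The retrieved item is a configuration dictionary that can still be edited, not a rendered waveform.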