AI RESEARCH

Drum Synthesis from Expressive Drum Grids via Neural Audio Codecs

arXiv cs.AI

arXiv:2605.10281v1 Announce Type: cross

Generating realistic drum audio directly from symbolic representations is a challenging task at the intersection of music perception and machine learning. We propose a system that transforms an expressive drum grid, a time-aligned MIDI representation with microtiming and velocity information, into drum audio by predicting discrete codes of a neural audio codec. Our approach uses a Transformer-based model to map the drum grid input to a sequence of codec tokens, which are then converted to waveform audio via a pre-trained codec decoder.
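To make the input representation concrete, here is a minimal sketch of how an expressive drum grid could be built from MIDI-style hits. It is an illustration under stated assumptions, not the authors' implementation: the names `Hit` and `build_drum_grid`, the 16th-note resolution, the three-channel layout (onset, velocity, microtiming offset), and the rule of keeping the loudest hit per cell are all hypothetical choices that the abstract does not specify.

```python
import math
from dataclasses import dataclass


@dataclass
class Hit:
    """One drum hit (hypothetical structure, not from the paper)."""
    time_sec: float   # onset time in seconds
    instrument: int   # index of the drum voice, e.g. 0 = kick, 1 = snare
    velocity: int     # MIDI velocity, 1..127


def build_drum_grid(hits, bpm, num_steps, num_instruments, steps_per_beat=4):
    """Quantize hits onto a fixed grid while keeping expressive detail.

    Returns three num_steps x num_instruments tables:
      onset    -- 1.0 where a hit occurs, else 0.0
      velocity -- MIDI velocity scaled to [0, 1]
      offset   -- microtiming deviation from the grid line, in fractions
                  of a step, within [-0.5, 0.5)
    """
    step_dur = 60.0 / bpm / steps_per_beat  # seconds per grid step

    def empty():
        return [[0.0] * num_instruments for _ in range(num_steps)]

    onset, velocity, offset = empty(), empty(), empty()

    for hit in hits:
        position = hit.time_sec / step_dur      # position in step units
        step = math.floor(position + 0.5)       # nearest grid step
        if not 0 <= step < num_steps:
            continue                            # hit falls outside the grid
        vel = hit.velocity / 127.0
        # Assumption: if two hits land in one cell, keep the louder one.
        # Velocity-0 hits are also skipped by this comparison.
        if vel <= velocity[step][hit.instrument]:
            continue
        onset[step][hit.instrument] = 1.0
        velocity[step][hit.instrument] = vel
        offset[step][hit.instrument] = position - step

    return onset, velocity, offset


# Example: one bar of 16 steps at 120 BPM (each step lasts 0.125 s).
hits = [
    Hit(time_sec=0.0, instrument=0, velocity=100),   # kick on the grid
    Hit(time_sec=0.135, instrument=1, velocity=64),  # snare, slightly late
    Hit(time_sec=0.5, instrument=0, velocity=127),   # kick on beat 2
]
onset, velocity, offset = build_drum_grid(
    hits, bpm=120, num_steps=16, num_instruments=2
)
```

In a pipeline like the one described, a grid of this kind would be the conditioning input to the Transformer, which predicts codec tokens that a pre-trained decoder then renders as audio. Those later stages are omitted here because the abstract does not name the codec or the model configuration.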