AI RESEARCH
BMdataset: A Musicologically Curated LilyPond Dataset
arXiv cs.CL
arXiv:2604.10628v1 Announce Type: cross Symbolic music research has relied almost exclusively on MIDI-based datasets; text-based engraving formats such as LilyPond remain unexplored for music understanding. We present BMdataset, a musicologically curated dataset of 393 LilyPond scores (2,646 movements) transcribed by experts directly from original Baroque manuscripts, with metadata covering composer, musical form, instrumentation, and sectional attributes. Building on this resource, we