AI RESEARCH
SODIUM: From Open Web Data to Queryable Databases
arXiv CS.AI
arXiv:2603.18447v1 Announce Type: cross During research, domain experts often ask analytical questions whose answers require integrating data from a wide range of web sources. Thus, they must spend substantial effort searching, extracting, and organizing raw data before analysis can begin. We formalize this process as the SODIUM task, where we conceptualize open domains such as the web as latent databases that must be systematically instantiated to support downstream querying.