AI RESEARCH

dnaHNet: A Scalable and Hierarchical Foundation Model for Genomic Sequence Learning

arXiv cs.LG

arXiv:2602.10603v3

Genomic foundation models have the potential to decode DNA syntax, yet they face a fundamental tradeoff in their input representation. Standard fixed-vocabulary tokenizers fragment biologically meaningful motifs such as codons and regulatory elements, while nucleotide-level models preserve biological coherence but incur prohibitive computational costs for long contexts. We