AI RESEARCH

NorBERTo: A ModernBERT Model Trained for Portuguese with 331 Billion Tokens Corpus

arXiv cs.AI

arXiv:2605.00086v1 Announce Type: cross High-quality corpora are essential for advancing Natural Language Processing (NLP) in Portuguese. Building on previous encoder-only models such as BERTimbau and Albertina PT-BR, we