AI RESEARCH
EmbBERT: Attention Under 2 MB Memory
arXiv CS.LG
arXiv:2502.10001v3 Announce Type: replace-cross Transformer architectures based on the attention mechanism have revolutionized natural language processing (NLP), driving major breakthroughs across virtually every NLP task. However, their substantial memory and computational requirements still hinder deployment on ultra-constrained devices such as wearables and Internet-of-Things (IoT) units, where available memory is limited to just a few megabytes. To address this challenge, we