Yeti: A compact protein structure tokenizer for reconstruction and multi-modal generation

arXiv:2605.09981v1 Announce Type: cross Multimodal models that jointly reason over protein sequences, structures, and function annotations within a unified representation hold immense potential for integrating multimodal data and generating new proteins with designed functional properties. To utilize transformer architectures, such models require a tokenizer that converts protein structure from continuous atomic coordinates into discrete representations suitable for scalable multimodal.