AI RESEARCH
Rethinking Weight Tying: Pseudo-Inverse Tying for LM Stable Training and Updates
arXiv cs.LG
arXiv:2602.04556v2 (announce type: replace-cross)
Weight tying is widely used in compact language models to reduce parameter count by sharing a single token table between the input embedding and the output projection. However, parameter sharing alone does not guarantee a stable token interface: during