AI RESEARCH

[D] How come Muon is only being used for Transformers?

r/MachineLearning

Muon has quickly been adopted in LLM