[D] How come Muon is only being used for Transformers?
r/MachineLearning
Muon has quickly been adopted in LLM