ByteDance-Seed/cryofm-v2
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss
LLM Swiss Round: Aggregating Multi-Benchmark Performance via Competitive Swiss-System Dynamics