Post
431
I just stress-tested the Beast: MiniMax-M2.1 on Z8 Fury G5.
2101 tokens/sec. FORTY concurrent clients. That's 609 t/s out, 1492 t/s in. The model outputs fire faster than I can type, but feeds on data like a black hole on cheat day.
But wait, there's more! Threw it into Claude Code torture testing with 60+ tools, 8 agents (7 sub-agents because apparently one wasn't enough chaos). It didn't even flinch. Extremely fast, scary good at coding. The kind of performance that makes you wonder if the model's been secretly reading Stack Overflow in its spare time lol
3 months ago, these numbers lived in my "maybe in “2030 dreams. Today it's running on my desk AND heaths my home office during the winter!
2101 tokens/sec. FORTY concurrent clients. That's 609 t/s out, 1492 t/s in. The model outputs fire faster than I can type, but feeds on data like a black hole on cheat day.
But wait, there's more! Threw it into Claude Code torture testing with 60+ tools, 8 agents (7 sub-agents because apparently one wasn't enough chaos). It didn't even flinch. Extremely fast, scary good at coding. The kind of performance that makes you wonder if the model's been secretly reading Stack Overflow in its spare time lol
3 months ago, these numbers lived in my "maybe in “2030 dreams. Today it's running on my desk AND heaths my home office during the winter!