Compact LLM Battle Arena: Frugal AI Face-Off!
Evaluate LLM reasoning capabilities through the Candle Test