Seeking Expert Comparison: Behemoth X v2 vs Behemoth ReduX v1

#1
by CosmossG - opened

Hi TheDrummer,

I hope you're doing well! I've been researching the latest large language models and noticed some interesting discrepancies between benchmark data and community feedback regarding two top performers in the UGI Writing category (less than 200 billion parameters), both being your models: Behemoth X v2 and Behemoth ReduX v1.

Both models appear on the UGI leaderboard with the following scores:

| Model | NatInt | Standard | Pop Culture | World model | Writing | Release Date |
|---|---|---|---|---|---|---|
| Behemoth X v2 | 34.67 | 38.93 | 28.97 | 36.10 | 50.27 | 2025-08-21 |
| Behemoth ReduX v1 | 33.72 | 38.86 | 33.45 | 28.87 | 48.76 | 2025-09-06 |
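
To make the gaps easier to see, here is a rough sketch that computes the per-category difference between the two models and the number of days between releases. The numbers are copied from the leaderboard rows above; the dictionary layout and the helper name `category_deltas` are just my own choices for illustration, not anything taken from the UGI leaderboard itself.

```python
from datetime import date

# Scores copied from the leaderboard rows above (layout is my own assumption).
scores = {
    "Behemoth X v2": {
        "NatInt": 34.67, "Standard": 38.93, "Pop Culture": 28.97,
        "World model": 36.10, "Writing": 50.27,
    },
    "Behemoth ReduX v1": {
        "NatInt": 33.72, "Standard": 38.86, "Pop Culture": 33.45,
        "World model": 28.87, "Writing": 48.76,
    },
}
released = {
    "Behemoth X v2": date(2025, 8, 21),
    "Behemoth ReduX v1": date(2025, 9, 6),
}


def category_deltas(a: str, b: str) -> dict[str, float]:
    """Return score(a) - score(b) per category, rounded to 2 decimals."""
    return {cat: round(scores[a][cat] - scores[b][cat], 2) for cat in scores[a]}


deltas = category_deltas("Behemoth X v2", "Behemoth ReduX v1")
gap_days = (released["Behemoth ReduX v1"] - released["Behemoth X v2"]).days

for cat, delta in deltas.items():
    # A positive delta means X v2 scores higher in that category.
    leader = "X v2" if delta > 0 else "ReduX v1"
    print(f"{cat:12s} {delta:+6.2f}  ({leader} ahead)")
print(f"Days between releases: {gap_days}")
```

The gaps are uneven: X v2 leads World model by 7.23 points and Writing by 1.51, the Standard gap is only 0.07, and ReduX v1 leads Pop Culture by 4.48.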

Based on these metrics, Behemoth X v2 scores higher in every category except Pop Culture, which is the one I'm most interested in (e.g. getting the model to know, understand, and use internet slang, 'brainrot'/memes, and game/movie references, such as discussing OSTs). However, I'm seeing conflicting signals in community discussions:

  • Many testers praise ReduX v1 as a significant improvement over Behemoth X v1.2
  • Several highlighted comments (particularly 3 and 5 from the https://huggingface.co/TheDrummer/Behemoth-ReduX-123B-v1 description) broadly state that ReduX has "fixed every problem with X" - seemingly referring to the entire Behemoth X line

This creates a puzzle: ReduX v1 was released only ~2 weeks after Behemoth X v2, yet receives strong qualitative praise that seems to contradict the benchmark data.

My specific questions for you as an expert finetuner:

  1. How do you interpret this discrepancy between benchmark scores and community feedback?
  2. What are the practical strengths and weaknesses of each model?
  3. Could the timing of these releases (and the rapid iteration between them) explain why the benchmarks don't capture the full picture of user experience? For example, the testimonials may not account for Behemoth X v2 at all, given how little time separated the two releases.

Thanks for your time and wisdom!

