Can we have some more popular benchmarks?

#8
by rombodawg - opened

I've never even heard of the benchmarks you are using, except for the first one (IFEval).

Here is a good list of benchmarks to use for your models.

0324_comparison.png

GLM-4-32B has better instruction following than QwQ-32B. It's the strongest open-source model I've ever used for instruction following, and its coding ability is okay. Overall, it's a pretty good model.
