GPQA-Diamond and SuperGPQA benchmarks

#1
by SkyMind - opened

Please add GPQA-Diamond and SuperGPQA to the benchmark comparisons for this model. Phi-4 punches above its weight, so these results would be informative.

Thanks!
