Selene 1 Mini: the best small language model-as-a-judge, by AtlaAI and 10 others • Jan 29
Judge Arena: Benchmarking LLMs as Evaluators, by kaikaidai and 7 others • Nov 19, 2024
Experimenting with different training objectives for an AI evaluator, by kaikaidai and 1 other • Oct 31, 2024