Model,Avg Judge Score,ROUGE-1,ROUGE-2,ROUGE-L,BLEU,Samples w/ Eval,Samples w/ Caption
claude_4_sonnet,3.16,0.463,0.179,0.281,0.060,500,500
cliptagger_12b,3.53,0.674,0.404,0.520,0.267,499,998
gpt_4.1,3.64,0.581,0.260,0.376,0.119,494,500