Sam Heutmaker committed on
Commit 4ffc59c · 1 Parent(s): fb437aa

fix graphs

Files changed (1)
  1. README.md +9 -10
README.md CHANGED
@@ -51,13 +51,14 @@ Performance metrics on our internal evaluation set:
 
  ### Benchmark Visualizations
 
- <div align="center">
- <img src="./assets/judge-score.png" alt="Average Judge Score Comparison" width="45%" />
- <img src="./assets/rouge-1.png" alt="ROUGE-1 Score Comparison" width="45%" />
- <br/>
- <img src="./assets/rouge-L.png" alt="ROUGE-L Score Comparison" width="45%" />
- <img src="./assets/bleu.png" alt="BLEU Score Comparison" width="45%" />
- </div>
+ <p align="center">
+ <img src="./assets/judge-score.png" alt="Average Judge Score Comparison" width="49%" />
+ <img src="./assets/rouge-1.png" alt="ROUGE-1 Score Comparison" width="49%" />
+ </p>
+ <p align="center">
+ <img src="./assets/rouge-L.png" alt="ROUGE-L Score Comparison" width="49%" />
+ <img src="./assets/bleu.png" alt="BLEU Score Comparison" width="49%" />
+ </p>
 
  FP8 quantization showed no measurable quality degradation compared to bf16 precision.
 
@@ -75,9 +76,7 @@ GrassData/ClipTagger-12b delivers frontier-quality performance at a fraction of
 
  *Cost calculations based on 700 input tokens and 250 output tokens per generation.
 
- <div align="center">
- <img src="./assets/cost.png" alt="Cost Comparison Per 1 Million Generations" width="80%" />
- </div>
+ <img src="./assets/cost.png" alt="Cost Comparison Per 1 Million Generations" width="100%" />
 
  ClipTagger-12b offers **15x cost savings** compared to GPT-4.1 and **17x cost savings** compared to Claude 4 Sonnet, while maintaining comparable quality metrics.
 