brendanm12345 committed on
Commit 7155c7a · verified · 1 Parent(s): 8f352d6

Update README.md

Files changed (1)
  1. README.md +1 -10
README.md CHANGED
@@ -15,7 +15,7 @@ language:
 
  # Weaver Distilled for MATH500
 
- A distilled cross-encoder model that captures 98.7% of Weaver's accuracy while reducing verification compute by 99.97%. This model is fine-tuned from ModernBERT-large to predict the correctness of mathematical reasoning responses, trained on Weaver ensemble scores from 35 different verifiers.
+ This is a distilled cross-encoder model based on ModernBERT-large, trained to predict the correctness of answers on MATH500. This specialized verifier was trained on Weaver scores aggregated over 35 different verifiers and reward models.
 
  ## Model Details
 
@@ -25,15 +25,6 @@ A distilled cross-encoder model that captures 98.7% of Weaver's accuracy while r
  - **Training Data**: MATH500 problems with Weaver scores from 35 LM judges and reward models
  - **Task**: Binary classification for answer correctness prediction
 
- ## Performance
-
- On MATH500 with Llama 3.1 70B generations:
- - **Weaver (Full)**: 93.4% accuracy, high compute cost
- - **Weaver (Distilled)**: 92.2% accuracy, 99.97% compute reduction
- - **Majority Voting**: 83.0% accuracy
-
- TODO: replace these with the actual numbers
-
  ## Quick Start
 
  ```python