brendanm12345 committed on
Commit 40d10c3 · verified · 1 Parent(s): 6726500

Update README.md

Files changed (1)
  1. README.md +1 -9
README.md CHANGED
@@ -16,7 +16,7 @@ language:
 
  # Weaver Distilled for MMLU-Pro
 
- A distilled cross-encoder model that captures 98.7% of Weaver's accuracy while reducing verification compute by 99.97%. This model is fine-tuned from ModernBERT-large to predict the correctness of academic reasoning responses, trained on Weaver ensemble scores from 35 different verifiers.
+ This is a distilled cross-encoder model based on ModernBERT-large, trained to predict the correctness of answers on MMLU Pro. This specialized verifier was trained on Weaver scores aggregated over 35 different verifiers and reward models.
 
  ## Model Details
 
@@ -26,14 +26,6 @@ A distilled cross-encoder model that captures 98.7% of Weaver's accuracy while r
  - **Training Data**: MMLU-Pro problems with Weaver scores from 35 LM judges and reward models
  - **Task**: Binary classification for answer correctness prediction
 
- ## Performance
-
- On MMLU-Pro with Llama 3.1 70B generations:
- <!-- TODO: Update with actual performance numbers -->
- - **Weaver (Full)**: XX.X% accuracy, high compute cost
- - **Weaver (Distilled)**: XX.X% accuracy, 99.97% compute reduction
- - **Majority Voting**: XX.X% accuracy
-
  ## Quick Start
 
  ```python