Update README.md
README.md
CHANGED
@@ -11,7 +11,7 @@ This is a distilled cross-encoder model based on ModernBERT-large, trained to pr
 - **Base Model**: [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large)
 - **Architecture**: Cross-encoder with MLP head (1024 → 512 → 256 → 1)
 - **Max Sequence Length**: 4096
-- **Training Data**: MMLU Pro Subset (500 queries) scored by 35 different LM Judges and reward models, aggregated to form sample-level scores with Weaver
+- **Training Data**: [MMLU Pro](https://huggingface.co/datasets/TIGER-Lab/MMLU-Pro) Subset (500 queries) scored by 35 different LM Judges and reward models, aggregated to form sample-level scores with Weaver
 - **Training Objective**: Binary classification (correct/incorrect answer prediction)
 
 ## Usage
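The architecture bullets above (ModernBERT-large backbone, 1024 → 512 → 256 → 1 MLP head, 4096-token context, binary correct/incorrect objective) can be read as the minimal sketch below. This is an illustrative reconstruction rather than the repository's actual loading code: the checkpoint name, pooling strategy, activations, and scoring example are assumptions not stated in this diff, and the head here is randomly initialized instead of the trained weights.

```python
# Illustrative sketch only. The README bullets describe a cross-encoder on
# ModernBERT-large with a 1024 -> 512 -> 256 -> 1 MLP head and a sigmoid
# correct/incorrect output; pooling, activations, and the example inputs
# below are assumptions for illustration.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class CrossEncoderWithMLPHead(nn.Module):
    def __init__(self, base_model_name: str = "answerdotai/ModernBERT-large"):
        super().__init__()
        self.backbone = AutoModel.from_pretrained(base_model_name)
        # MLP head matching the README's 1024 -> 512 -> 256 -> 1 description.
        self.head = nn.Sequential(
            nn.Linear(1024, 512),
            nn.ReLU(),
            nn.Linear(512, 256),
            nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, input_ids, attention_mask):
        hidden = self.backbone(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state
        cls = hidden[:, 0]  # assumed pooling: first-token ([CLS]) representation
        return self.head(cls).squeeze(-1)  # raw logit; sigmoid gives P(correct)

tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-large")
model = CrossEncoderWithMLPHead().eval()

# A cross-encoder scores the (question, candidate answer) pair jointly.
enc = tokenizer(
    "What is the capital of France?",
    "Paris",
    truncation=True,
    max_length=4096,
    return_tensors="pt",
)
with torch.no_grad():
    prob_correct = torch.sigmoid(model(enc["input_ids"], enc["attention_mask"]))
print(f"P(answer is correct) ≈ {prob_correct.item():.3f}")
```

With the trained checkpoint loaded in place of the randomly initialized head, the sigmoid output would be interpreted as the probability that the candidate answer is correct, matching the binary training objective listed above.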