hazyresearch/Weaver_Distilled_ModernBERT_Large_for_MMLU-Pro

Text Classification


jonsaadfalcon committed on Jun 10

Commit b302259 · verified · 1 Parent(s): 44ae9db

Create README.md

Files changed (1)

README.md +60 -0

README.md ADDED

	@@ -0,0 +1,60 @@

+# Weaver Distilled - MMLU Pro (ModernBERT-large)
+This is a distilled cross-encoder based on ModernBERT-large, trained to predict whether an answer is correct across multiple domains. This general-purpose verifier was trained on a dataset that combines the outputs of 35 verifiers (LM judges and reward models), aggregated using Weaver.
+## Model Details
+- **Base Model**: [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large)
+- **Architecture**: Cross-encoder with MLP head (1024 → 512 → 256 → 1)
+- **Max Sequence Length**: 4096
+- **Training Data**: Combined outputs of 35 LM judges and reward models, aggregated with Weaver
+- **Training Objective**: Binary classification (correct/incorrect answer prediction)
+## Usage
+```python
+import torch
+
+from custom_crossencoder import CustomCrossEncoder, TrainingConfig
+# Initialize model
+config = TrainingConfig(
+    model_name="answerdotai/ModernBERT-large",
+    max_length=4096,
+    mlp_hidden_dims=[1024, 512, 256]
+)
+model = CustomCrossEncoder(config)
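+# Load checkpoint. torch.load expects a local file path, not a Hub repo ID, so first
+# download the weights from hazyresearch/Weaver_Distilled_ModernBERT_Large_for_MMLU-Pro.
+# The path below is a placeholder; the checkpoint file name is not given in this README.
+checkpoint_path = "path/to/checkpoint.pt"
+model.load_state_dict(torch.load(checkpoint_path, map_location="cpu"))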
+model.eval()
+# Get prediction
+instruction = "Your instruction here"
+answer = "Your answer here"
+encoded = model.tokenizer(
+    text=instruction,
+    text_pair=answer,
+    truncation=True,
+    max_length=4096,
+    padding="max_length",
+    return_tensors="pt"
+)
+with torch.no_grad():
+    prediction = model(encoded["input_ids"], encoded["attention_mask"])
+```
+## Running Evaluation
+TODO: ADD EVALUATION_SIMPLE COMMAND HERE
+## License
+[Your chosen license]
+## Citation
+If you use this model in your research, please cite:
+```bibtex
+TODO
+```
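
The Usage snippet in the README stops at `prediction` and does not say how to read it. Given the stated binary-classification objective, one plausible post-processing step is sketched below. It assumes that `prediction` is a single raw logit (the README does not confirm this; if the model already applies a sigmoid, skip the conversion). The helper names `logit_to_probability` and `is_correct` and the 0.5 threshold are hypothetical, chosen only for illustration.

```python
import math


def logit_to_probability(logit: float) -> float:
    """Map a raw logit to a probability with a numerically stable sigmoid."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    # For negative logits, exp(logit) cannot overflow.
    z = math.exp(logit)
    return z / (1.0 + z)


def is_correct(logit: float, threshold: float = 0.5) -> bool:
    """Label an answer as correct when its probability reaches the threshold.

    Assumption: the threshold of 0.5 is illustrative, not taken from the README.
    """
    return logit_to_probability(logit) >= threshold
```

Under the same assumption, a single-example output could be passed in as `is_correct(prediction.item())`. A threshold tuned on held-out data may be preferable to the default.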