---
license: apache-2.0
---

# Weaver Distilled - MMLU-Pro (ModernBERT-large)

This is a distilled cross-encoder model based on ModernBERT-large, trained to predict the correctness of candidate answers on MMLU-Pro. This specialized verifier was trained on Weaver scores aggregated over 35 different LM judges and reward models.

## Model Details

- **Base Model:** answerdotai/ModernBERT-large
- **Architecture:** Cross-encoder with MLP head (1024 → 512 → 256 → 1)
- **Max Sequence Length:** 4096
- **Training Data:** MMLU-Pro subset (500 queries) scored by 35 different LM judges and reward models, aggregated with Weaver to form sample-level scores
- **Training Objective:** Binary classification (correct/incorrect answer prediction)
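The MLP-head dimensions listed above can be sketched as a small PyTorch module. This is a hypothetical reconstruction for illustration only; the actual `CustomCrossEncoder` may differ in pooling, activations, and regularization.

```python
import torch
import torch.nn as nn


class CrossEncoderHead(nn.Module):
    """Hypothetical sketch of the MLP head described above:
    a pooled 1024-d ModernBERT-large embedding -> 512 -> 256 -> 1 logit."""

    def __init__(self, hidden_dims=(1024, 512, 256)):
        super().__init__()
        layers = []
        in_dim = hidden_dims[0]
        for out_dim in hidden_dims[1:]:
            layers += [nn.Linear(in_dim, out_dim), nn.ReLU()]
            in_dim = out_dim
        layers.append(nn.Linear(in_dim, 1))  # single correctness logit
        self.mlp = nn.Sequential(*layers)

    def forward(self, pooled):
        # pooled: (batch, 1024) sentence embedding -> (batch,) logits
        return self.mlp(pooled).squeeze(-1)


head = CrossEncoderHead()
logits = head(torch.randn(2, 1024))
print(logits.shape)  # one logit per (instruction, answer) pair
```

A single output logit matches the binary-classification training objective above.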

## Usage

```python
import torch

from custom_crossencoder import CustomCrossEncoder, TrainingConfig

# Initialize model
config = TrainingConfig(
    model_name="answerdotai/ModernBERT-large",
    max_length=4096,
    mlp_hidden_dims=[1024, 512, 256],
)
model = CustomCrossEncoder(config)

# Load checkpoint (first download the weights from the
# hazyresearch/Weaver_Distilled_ModernBERT_Large_for_MMLU-Pro repo,
# e.g. via huggingface_hub, then point torch.load at the local file)
model.load_state_dict(torch.load("path/to/checkpoint.pt"))
model.eval()

# Get prediction
instruction = "Your instruction here"
answer = "Your answer here"
encoded = model.tokenizer(
    text=instruction,
    text_pair=answer,
    truncation=True,
    max_length=4096,
    padding="max_length",
    return_tensors="pt",
)
with torch.no_grad():
    prediction = model(encoded["input_ids"], encoded["attention_mask"])
```
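The forward pass returns a raw score. Assuming the model emits a single logit from its binary-classification head (an assumption based on the training objective above, not something the model card states explicitly), it can be mapped to a correctness probability with a sigmoid:

```python
import torch

# Hypothetical post-processing, assuming `prediction` holds a raw logit
# from the binary correctness head rather than a probability.
logit = torch.tensor([1.2])   # stand-in for the model's output
prob = torch.sigmoid(logit)   # map logit -> probability in (0, 1)
print(float(prob))            # ≈ 0.769
```

Scores closer to 1 indicate the verifier considers the answer more likely correct, which is useful for ranking candidate answers.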

## Running Evaluation

TODO: ADD EVALUATION_SIMPLE COMMAND HERE

## License

Apache 2.0 (see the `license` field in the metadata above).

## Citation

If you use this model in your research, please cite:

TODO