Part of the **train-once-answer-all** collection: model checkpoints and training data modifications for the paper "Train Once, Answer All: Many Pretraining Experiments for the Cost of One".
This model is a research variant of OLMo-2-0425-1B.
It was pretrained from scratch on 210B tokens, with experimental modifications applied to the training data.
The baseline model, trained on the same data without any of these modifications, is available here.
The model is described in the paper "Train Once, Answer All: Many Pretraining Experiments for the Cost of One".
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the experimental model and its tokenizer from the Hugging Face Hub.
olmo = AutoModelForCausalLM.from_pretrained("sbordt/OLMo-2-1B-Exp")
tokenizer = AutoTokenizer.from_pretrained("sbordt/OLMo-2-1B-Exp")
```
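Once loaded, the model can be used for standard causal language modeling. A minimal generation sketch is shown below; the prompt and decoding parameters are illustrative only and not taken from the paper.

```python
# Illustrative generation example (prompt and decoding settings are assumptions,
# not recommendations from the paper).
inputs = tokenizer("Language modeling is", return_tensors="pt")
output = olmo.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.95)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```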
```bibtex
@article{bordt2025trainonce,
  title   = {Train Once, Answer All: Many Pretraining Experiments for the Cost of One},
  author  = {Bordt, Sebastian and Pawelczyk, Martin},
  journal = {arXiv preprint arXiv:2509.23383},
  year    = {2025},
}
```