# Model Card for OLMo-2-1B-Exp

This model is a research variant of OLMo-2-0425-1B. It was pretrained from scratch on 210B tokens, with additional experimental modifications applied to the training data.

A baseline model, trained on the same data without any experimental modifications, is also available.

The model is described in the paper "Train Once, Answer All: Many Pretraining Experiments for the Cost of One" (arXiv:2509.23383).

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer from the Hugging Face Hub
olmo = AutoModelForCausalLM.from_pretrained("sbordt/OLMo-2-1B-Exp")
tokenizer = AutoTokenizer.from_pretrained("sbordt/OLMo-2-1B-Exp")
```
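Once loaded, the model can be used with the standard `transformers` generation API. Below is a minimal sketch; the prompt and sampling parameters are illustrative, not taken from the original card:

```python
# Tokenize an example prompt (illustrative only)
inputs = tokenizer("Language modeling is ", return_tensors="pt")

# Sample a continuation; generation settings are an assumption, tune as needed
outputs = olmo.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.95)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```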

## Citation Information

```bibtex
@article{bordt2025trainonce,
  title   = {Train Once, Answer All: Many Pretraining Experiments for the Cost of One},
  author  = {Bordt, Sebastian and Pawelczyk, Martin},
  journal = {arXiv preprint arXiv:2509.23383},
  year    = {2025},
}
```