Part of the **train-once-answer-all** collection: model checkpoints and training data modifications for the paper "Train Once, Answer All: Many Pretraining Experiments for the Cost of One".
This model is a research variant of OLMo-2-0425-1B.
It was pretrained from scratch on 210B tokens, with experimental modifications applied to the training data.
The baseline model, trained on the same data without any of these modifications, is available here.
The model is described in the paper "Train Once, Answer All: Many Pretraining Experiments for the Cost of One".
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the experimental model and its tokenizer from the Hugging Face Hub.
olmo = AutoModelForCausalLM.from_pretrained("sbordt/OLMo-2-1B-Exp")
tokenizer = AutoTokenizer.from_pretrained("sbordt/OLMo-2-1B-Exp")
```
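Once loaded, the model can be used for standard causal language modeling. A minimal generation sketch is shown below; the prompt and decoding parameters are illustrative only and not taken from the paper.

```python
# Illustrative generation example (prompt and decoding settings are assumptions,
# not recommendations from the paper).
inputs = tokenizer("Language modeling is", return_tensors="pt")
output = olmo.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.95)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```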
```bibtex
@article{bordt2025trainonce,
  title   = {Train Once, Answer All: Many Pretraining Experiments for the Cost of One},
  author  = {Bordt, Sebastian and Pawelczyk, Martin},
  journal = {arXiv preprint arXiv:2509.23383},
  year    = {2025},
}
```