metadata
tags:
- babylm
- language-model
- coherence
license: mit
language:
- uk
babybabellm-mono-ukr
This repository contains checkpoints for the mono-ukr variant of BabyBabeLLM.
Files
- *_15_16.bin – main model weights
- *_15_16_ema.bin – EMA smoothed weights
- *_15_16_state_dict.bin – PyTorch state dict
- pytorch_model.bin – extracted EMA weights (for AutoModel)
- Config + tokenizer files for model loading
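
As a rough illustration of how the filename patterns above separate the checkpoint variants, here is a minimal sketch that maps a filename to its listed role. The helper name `describe_file` and the example filename `model_15_16_ema.bin` are hypothetical and do not come from this repository. Only the patterns and descriptions are taken from the list.

```python
from fnmatch import fnmatchcase

# Patterns and descriptions copied from the file list in this README.
# The patterns do not overlap, so the order is only for readability.
FILE_ROLES = [
    ("*_15_16.bin", "main model weights"),
    ("*_15_16_ema.bin", "EMA smoothed weights"),
    ("*_15_16_state_dict.bin", "PyTorch state dict"),
    ("pytorch_model.bin", "extracted EMA weights (for AutoModel)"),
]


def describe_file(filename):
    """Return the listed role of a checkpoint file, or None if nothing matches."""
    for pattern, role in FILE_ROLES:
        if fnmatchcase(filename, pattern):
            return role
    return None


# "model_15_16_ema.bin" is a hypothetical filename used only for illustration.
print(describe_file("model_15_16_ema.bin"))
print(describe_file("pytorch_model.bin"))
```

Of these files, `AutoModel.from_pretrained` is expected to read only `pytorch_model.bin` together with the config. The other `.bin` files would presumably need to be loaded manually.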
Usage
from transformers import AutoModel, AutoTokenizer
repo = "suchirsalhan/babybabellm-mono-ukr"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModel.from_pretrained(repo)
inputs = tokenizer("Hello world!", return_tensors="pt")
outputs = model(**inputs)
Notes
- These are research checkpoints trained on BabyLM-style data.
- Model naming: mono-ukr indicates the language/config variant.
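
To make the naming note concrete, the sketch below splits a repository ID into its parts. It assumes a scheme of the form `owner/babybabellm-<config>-<language>`, which is inferred from this one repository name and not documented here. The function name `parse_repo_name` is hypothetical.

```python
def parse_repo_name(repo_id):
    """Split a repo ID into owner, family, config, and language.

    Assumes the scheme 'owner/<family>-<config>-<language>'. This is an
    inference from the name 'babybabellm-mono-ukr', not a documented rule.
    """
    owner, _, name = repo_id.rpartition("/")
    # Raises ValueError if the name does not have three parts.
    family, config, language = name.split("-", 2)
    return {
        "owner": owner,
        "family": family,
        "config": config,
        "language": language,
    }


parts = parse_repo_name("suchirsalhan/babybabellm-mono-ukr")
print(parts["config"], parts["language"])
```

Under this assumed scheme, `ukr` is the ISO 639-3 code for Ukrainian, which agrees with the `uk` language tag in the metadata.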