---
tags:
- babylm
- language-model
- coherence
license: mit
language:
- uk
---
# babybabellm-mono-ukr
This repository contains checkpoints for the **mono-ukr** variant of **BabyBabeLLM**.
## Files
- `*_15_16.bin` – main model weights
- `*_15_16_ema.bin` – EMA-smoothed (exponential moving average) weights
- `*_15_16_state_dict.bin` – PyTorch state dict
- `pytorch_model.bin` – EMA weights extracted for loading with `AutoModel`
- Config and tokenizer files needed to load the model
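
If you download the repository and want to tell the checkpoint files apart programmatically, the naming scheme above can be turned into a small lookup. This is only a sketch: the `checkpoint_role` helper and the example filename prefix `model` are hypothetical and not part of this repository; only the suffix patterns come from the list above.

```python
from fnmatch import fnmatchcase

# Suffix patterns taken from the file list above.
# The role strings are paraphrases of that list.
ROLE_PATTERNS = [
    ("*_15_16_state_dict.bin", "PyTorch state dict"),
    ("*_15_16_ema.bin", "EMA-smoothed weights"),
    ("*_15_16.bin", "main model weights"),
    ("pytorch_model.bin", "extracted EMA weights (for AutoModel)"),
]


def checkpoint_role(filename: str) -> str:
    """Return the documented role of a file, based on its name only."""
    for pattern, role in ROLE_PATTERNS:
        if fnmatchcase(filename, pattern):
            return role
    # Anything else is assumed to be a config, tokenizer, or unrelated file.
    return "config, tokenizer, or other file"


# Hypothetical filenames, used purely for illustration.
for name in ["model_15_16.bin", "model_15_16_ema.bin", "pytorch_model.bin"]:
    print(name, "->", checkpoint_role(name))
```

The helper inspects filenames only; it does not open or validate the checkpoints themselves.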
## Usage
```python
from transformers import AutoModel, AutoTokenizer

repo = "suchirsalhan/babybabellm-mono-ukr"

# Load the tokenizer and the EMA weights stored in pytorch_model.bin.
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModel.from_pretrained(repo)

# Tokenize a sample sentence and run a forward pass.
inputs = tokenizer("Hello world!", return_tensors="pt")
outputs = model(**inputs)
```
## Notes
- These are research checkpoints trained on BabyLM-style data.
- Model naming: `mono-ukr` identifies the language/config variant, here Ukrainian (`uk` in the metadata above).