mmBERT: a modern multilingual encoder
mmBERT is trained on 3T tokens from over 1,800 languages, achieving SoTA scores on benchmarks and exceptional low-resource performance.
• jhu-clsp/mmBERT-base (Fill-Mask)
• jhu-clsp/mmBERT-small (Fill-Mask)
• jhu-clsp/mmBERT-checkpoints
• jhu-clsp/mmBERT-pretrain-p1-fineweb2-langs
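A minimal usage sketch, assuming mmBERT-base follows the standard Hugging Face fill-mask pipeline interface; the example sentence and printed fields are illustrative, and the mask token is read from the tokenizer rather than hard-coded:

```python
from transformers import pipeline

# Load the multilingual encoder as a fill-mask pipeline.
unmasker = pipeline("fill-mask", model="jhu-clsp/mmBERT-base")

# Use the tokenizer's own mask token to avoid assuming its spelling.
mask = unmasker.tokenizer.mask_token

# mmBERT is multilingual, so prompts in other languages work the same way.
for pred in unmasker(f"Paris is the {mask} of France."):
    print(pred["token_str"], round(pred["score"], 3))
```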
Encoders vs Decoders: the Ettin Suite
A collection of SoTA, open-data, paired encoder-only and decoder-only models ranging from 17M to 1B parameters. See the paper "Seq vs Seq: An Open Suite of Paired Encoders and Decoders" (arXiv:2507.11412, published Jul 2025) at https://arxiv.org/abs/2507.11412.
• jhu-clsp/ettin-encoder-17m (Fill-Mask)
• jhu-clsp/ettin-encoder-32m (Feature Extraction)
• jhu-clsp/ettin-encoder-68m (Fill-Mask)
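A sketch contrasting a paired encoder and decoder, assuming both follow the standard transformers Auto* interfaces; only the encoder ids appear in the collection above, so the decoder id jhu-clsp/ettin-decoder-17m is an assumption that mirrors the encoder naming:

```python
from transformers import AutoModel, AutoModelForCausalLM, AutoTokenizer

# Encoder half: contextual embeddings for feature extraction or fill-mask.
enc_tok = AutoTokenizer.from_pretrained("jhu-clsp/ettin-encoder-17m")
encoder = AutoModel.from_pretrained("jhu-clsp/ettin-encoder-17m")
inputs = enc_tok("Paired encoders and decoders.", return_tensors="pt")
emb = encoder(**inputs).last_hidden_state
print(emb.shape)  # (batch, seq_len, hidden_size)

# Decoder half: the paired causal LM (assumed id mirroring the encoder's).
dec_tok = AutoTokenizer.from_pretrained("jhu-clsp/ettin-decoder-17m")
decoder = AutoModelForCausalLM.from_pretrained("jhu-clsp/ettin-decoder-17m")
prompt = dec_tok("Encoders and decoders", return_tensors="pt")
out = decoder.generate(**prompt, max_new_tokens=20)
print(dec_tok.decode(out[0], skip_special_tokens=True))
```

Because the pairs share data and training recipes, swapping the model id is, under these assumptions, all that changes between the two halves.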