Mismatch between SmolLM2-360M-intermediate-checkpoints and SmolLM2-360M performance
#9 opened about 6 hours ago by Tobi-r9
Need clarification on number of checkpoints
#8 opened 2 months ago by bedio
More Training Information Required
#7 opened 4 months ago by jayan12k
Sentencepiece tokenizer
#6 opened 7 months ago by bh4
Because of the size mismatch, can't use from transformers import LlamaForCausalLM as a workaround.
1
#5 opened 7 months ago by MartialTerran
Safetensors size mismatch.
5
#4 opened 7 months ago by MartialTerran
Sample model script for bfloat16 downloads the safetensors parameter files, then reports a mismatch in their dimensions.
1
#3 opened 7 months ago by MartialTerran
Need help building a SmolLM2_360M_model.py
1
#2 opened 7 months ago by MartialTerran
Reproducing Evaluation with lighteval
4
#1 opened 7 months ago by PatrickHaller