---
license: apache-2.0
tags:
- StepLaw
- causal-lm
language:
- en
library_name: transformers
pipeline_tag: text-generation
model-index:
- name: >-
    step2v2_0618_h1280_ffnh9472_numh10_numl10_lr5.524e-03_bs128_ti381469_mlr1e-5
  results: []
---
StepLaw models are trained with various hyperparameter settings to enable research on scaling laws and hyperparameter optimization. This model was trained with a learning rate of 5.524e-03 and a batch size of 262144 tokens (128 sequences per step, as reflected by `bs128` in the model name) for 381469 iterations, for a total of 100.0B training tokens.
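
Below is a minimal usage sketch with the `transformers` text-generation interface declared in the metadata. The Hub repository id is an assumption derived from the model name; substitute the actual path if it differs.

```python
# Minimal sketch: load the checkpoint and generate text with transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository id, built from the model name above.
model_id = "StepLaw/step2v2_0618_h1280_ffnh9472_numh10_numl10_lr5.524e-03_bs128_ti381469_mlr1e-5"

# trust_remote_code may be needed if the checkpoint ships custom modeling code.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("The scaling law for language models", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```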