StepLaw/StepLaw-N_119M-D_3.0B-LR7.812e-03-BS262144

@@ -23,7 +23,7 @@ This model is part of the [StepLaw-N_119M-D_3.0B](https://huggingface.co/collect
 - **Feed-forward network size (FFN)**: 6416
 - **Attention heads**: 12
 - **Layers**: 7
-- **Parameter count**: 119MM
+- **Parameter count**: 119M
 ### Training Parameters
 - **Learning rate (lr)**: 7.812e-03
@@ -48,7 +48,4 @@ model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)
 inputs = tokenizer("A long time ago in a galaxy far, far away", return_tensors="pt")
 outputs = model.generate(**inputs, max_length=100)
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
-```## Part of StepLaw Project
-StepLaw is an initiative to provide thousands of models for optimal hyperparameter research.
-Visit [StepLaw Project](https://step-law.github.io/) for more information.
+```
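
The repository name appears to encode the same values the model card lists (`N_119M` matches the parameter count, `LR7.812e-03` matches the learning rate). A minimal sketch of reading those fields back out of the name follows. The helper `parse_steplaw_name`, the field labels, and the reading of `D` as training-data size and `BS` as batch size are assumptions inferred from this one name, not something the StepLaw project documents here.

```python
import re

# Assumed layout, inferred from a single repository name:
#   StepLaw-N_<params>-D_<data>-LR<learning rate>-BS<batch size>
NAME_PATTERN = re.compile(
    r"^StepLaw-N_(?P<params>[^-]+)-D_(?P<data>[^-]+)-LR(?P<lr>.+)-BS(?P<batch>\d+)$"
)

# Magnitude suffixes as used in the name (M = million, B = billion).
SUFFIXES = {"K": 10**3, "M": 10**6, "B": 10**9}


def expand(value: str) -> int:
    """Turn a string such as '119M' or '3.0B' into an integer count."""
    if value and value[-1] in SUFFIXES:
        return round(float(value[:-1]) * SUFFIXES[value[-1]])
    return round(float(value))


def parse_steplaw_name(name: str) -> dict:
    """Hypothetical helper: split a StepLaw repository name into its fields."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized StepLaw name: {name!r}")
    return {
        "params": expand(match["params"]),       # N, parameter count
        "data": expand(match["data"]),           # D, assumed training-data size
        "lr": float(match["lr"]),                # learning rate
        "batch_size": int(match["batch"]),       # BS, assumed batch size
    }


config = parse_steplaw_name("StepLaw-N_119M-D_3.0B-LR7.812e-03-BS262144")
print(config)
```

The learning-rate field itself contains a hyphen (`e-03`), so the pattern matches it greedily up to the final `-BS` rather than splitting the name on hyphens.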