StepLaw/StepLaw-N_1.0B-D_19.0B-LR3.453e-04-BS262144

@@ -23,7 +23,7 @@ This model is part of the [StepLaw-N_1.0B-D_19.0B](https://huggingface.co/collec
 - **Feed-forward network size (FFN)**: 8192
 - **Attention heads**: 16
 - **Layers**: 16
-- **Parameter count**: 1.1BM
+- **Parameter count**: 1.1B
 ### Training Parameters
 - **Learning rate (lr)**: 3.453e-04
@@ -48,7 +48,4 @@ model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)
 inputs = tokenizer("A long time ago in a galaxy far, far away", return_tensors="pt")
 outputs = model.generate(**inputs, max_length=100)
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
-```## Part of StepLaw Project
-StepLaw is an initiative to provide thousands of models for optimal hyperparameter research.
-Visit [StepLaw Project](https://step-law.github.io/) for more information.
+```
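
The repository name appears to encode the run's hyperparameters: its `LR3.453e-04` field matches the learning rate listed in the model card. A minimal sketch of reading those fields back out, assuming the pattern `N_<params>B-D_<tokens>B-LR<lr>-BS<batch>` holds (it is inferred from this one name only). The helper `parse_steplaw_name` is hypothetical and not part of the StepLaw project, and the unit of the batch size is not stated in this diff.

```python
import re

# Assumed naming pattern, inferred from a single repository name.
NAME_PATTERN = re.compile(
    r"N_(?P<params>[\d.]+)B-D_(?P<tokens>[\d.]+)B"
    r"-LR(?P<lr>[\d.eE+-]+)-BS(?P<bs>\d+)$"
)


def parse_steplaw_name(name: str) -> dict:
    """Split a StepLaw-style model name into its numeric fields (hypothetical helper)."""
    match = NAME_PATTERN.search(name)
    if match is None:
        raise ValueError(f"name does not follow the assumed pattern: {name!r}")
    return {
        "params_billions": float(match["params"]),  # N: nominal model size
        "tokens_billions": float(match["tokens"]),  # D: nominal training tokens
        "learning_rate": float(match["lr"]),
        "batch_size": int(match["bs"]),             # unit not stated in the diff
    }


config = parse_steplaw_name("StepLaw-N_1.0B-D_19.0B-LR3.453e-04-BS262144")
print(config)
```

Note that the `N_1.0B` field is a nominal size; the model card reports the parameter count as 1.1B.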