amang1802/Llama3.2-1B-summary-length-exp2
Tags: Text Generation, Transformers, Safetensors, llama, conversational, text-generation-inference
Model Card for Model ID
Summary Length PPO experiment #2
No KL divergence in loss
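As a rough illustration of what dropping the KL term means, the sketch below computes a standard PPO clipped surrogate loss with an optional KL penalty against a reference policy. It is not the author's training code: the function name, the `clip_eps` value, the `ref_logprobs` argument, and the per-token KL estimate are all assumptions made for this example. Setting `kl_coef=0.0` (the default here) corresponds to a loss with no KL divergence term.

```python
import math


def ppo_clipped_loss(new_logprobs, old_logprobs, advantages,
                     clip_eps=0.2, ref_logprobs=None, kl_coef=0.0):
    """Mean PPO clipped surrogate loss over tokens (hypothetical helper).

    With kl_coef=0.0 the KL penalty is dropped entirely, so only the
    clipped policy-gradient term remains.
    """
    total = 0.0
    for i, (new_lp, old_lp, adv) in enumerate(
            zip(new_logprobs, old_logprobs, advantages)):
        # Probability ratio between the updated policy and the sampling policy.
        ratio = math.exp(new_lp - old_lp)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # PPO takes the pessimistic (minimum) of the two surrogates.
        surrogate = min(ratio * adv, clipped * adv)
        loss = -surrogate
        if kl_coef > 0.0 and ref_logprobs is not None:
            # Simple per-token KL estimate against a reference policy (assumed form).
            loss += kl_coef * (new_lp - ref_logprobs[i])
        total += loss
    return total / len(new_logprobs)


# Unchanged policy (ratio 1) with advantage 1 gives a loss of -1.
loss_same = ppo_clipped_loss([-1.0], [-1.0], [1.0])
# Ratio 2 with a positive advantage is clipped at 1 + clip_eps.
loss_clipped = ppo_clipped_loss([math.log(2.0) - 1.0], [-1.0], [1.0])
```

Without the KL penalty, the clipping range is the only thing in this loss that limits how far each update moves the policy.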
Model Details
Dataset size: 1024
Epochs: 2
Batch size: 4 * 8 = 32 effective (using gradient accumulation)
Optimizer: PyTorch AdamW with default arguments, except LR = 0.0001
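The settings above can be sketched as a plain PyTorch loop. This is a minimal illustration rather than the actual training script: it assumes that 4 is the micro-batch size and 8 is the number of gradient accumulation steps, and it uses a toy linear model with random data in place of Llama 3.2 1B and the PPO objective. `train_one_epoch`, `MICRO_BATCH`, and `ACCUM_STEPS` are hypothetical names.

```python
import torch
from torch import nn

MICRO_BATCH = 4   # assumed per-step batch size
ACCUM_STEPS = 8   # assumed accumulation steps, giving an effective batch of 32
DATASET_SIZE = 1024


def train_one_epoch(model, optimizer, batches, loss_fn):
    """Step the optimizer once every ACCUM_STEPS micro-batches; return the step count."""
    optimizer.zero_grad()
    optimizer_steps = 0
    for i, (x, y) in enumerate(batches, start=1):
        # Scale the loss so the accumulated gradient is an average, not a sum.
        loss = loss_fn(model(x), y) / ACCUM_STEPS
        loss.backward()
        if i % ACCUM_STEPS == 0:
            optimizer.step()
            optimizer.zero_grad()
            optimizer_steps += 1
    return optimizer_steps


model = nn.Linear(16, 1)  # toy stand-in for the real model
# AdamW with PyTorch defaults, overriding only the learning rate.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# 1024 examples split into micro-batches of 4 (random toy data).
batches = [
    (torch.randn(MICRO_BATCH, 16), torch.randn(MICRO_BATCH, 1))
    for _ in range(DATASET_SIZE // MICRO_BATCH)
]
steps_per_epoch = train_one_epoch(model, optimizer, batches, nn.MSELoss())
```

Under these assumptions, and if each example is consumed once per epoch, one epoch is 1024 / 32 = 32 optimizer steps, so 2 epochs come to 64 steps.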
Safetensors
Model size: 1.24B params
Tensor type: BF16
Model tree for amang1802/Llama3.2-1B-summary-length-exp2
Quantizations: 1 model
Collection including amang1802/Llama3.2-1B-summary-length-exp2
PPO experiments: Using PPO with simpler reward functions (8 items, updated Jan 23)