train_svamp_42_1757596062

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the svamp dataset. It achieves the following results on the evaluation set:

Loss: 0.0463
Num Input Tokens Seen: 704336

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 10.0

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.1293	0.5	79	0.1236	35680
0.0551	1.0	158	0.0618	70512
0.0331	1.5	237	0.0569	105904
0.0195	2.0	316	0.0463	140960
0.0115	2.5	395	0.0491	176096
0.0183	3.0	474	0.0465	211424
0.0066	3.5	553	0.0616	246784
0.0597	4.0	632	0.0602	281968
0.0038	4.5	711	0.0668	317232
0.0033	5.0	790	0.0628	352368
0.0009	5.5	869	0.0656	387824
0.0036	6.0	948	0.0823	422704
0.0	6.5	1027	0.0790	457744
0.0001	7.0	1106	0.0817	493200
0.0	7.5	1185	0.0815	528304
0.0002	8.0	1264	0.0827	563520
0.0001	8.5	1343	0.0820	599072
0.0	9.0	1422	0.0835	634176
0.0	9.5	1501	0.0838	669440
0.0	10.0	1580	0.0835	704336

Framework versions

PEFT 0.15.2
Transformers 4.51.3
Pytorch 2.8.0+cu128
Datasets 3.6.0
Tokenizers 0.21.1

Downloads last month: -

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for rbelanec/train_svamp_42_1757596062

Base model

meta-llama/Meta-Llama-3-8B-Instruct

Adapter

(2016)

this model

Evaluation results

Metadata error: specify a dataset to view leaderboard