# train_cola_1757340163
This model is a PEFT fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the CoLA (Corpus of Linguistic Acceptability) dataset. It achieves the following results on the evaluation set:
- Loss: 0.1781
- Num Input Tokens Seen: 3668584
## Model description

More information needed
Intended uses & limitations
More information needed
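Since the card leaves this section blank, here is a minimal, hedged inference sketch: it loads this repository's PEFT adapter on top of the base model via `peft`'s `AutoPeftModelForCausalLM`. The prompt template is an assumption; the card does not document how CoLA examples were formatted during training.

```python
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

adapter_id = "rbelanec/train_cola_1757340163"

tokenizer = AutoTokenizer.from_pretrained(adapter_id)
# Loads meta-llama/Meta-Llama-3-8B-Instruct and applies this adapter on top.
model = AutoPeftModelForCausalLM.from_pretrained(adapter_id, device_map="auto")

# Assumed prompt format -- the actual training template is not documented.
prompt = (
    "Is the following sentence grammatically acceptable? Answer yes or no.\n"
    "Sentence: The book was read by the student."
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```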
## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10.0
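For reference, these values map onto a `transformers.TrainingArguments` configuration roughly as follows. This is a sketch reconstructed from the list above, not the actual training script; `output_dir` is a placeholder.

```python
from transformers import TrainingArguments

# Reconstruction of the reported hyperparameters; any field not listed
# above (e.g. output_dir) is a placeholder, not taken from the card.
training_args = TrainingArguments(
    output_dir="train_cola_1757340163",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10.0,
)
```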
### Training results
| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.3414        | 0.5   | 962   | 0.2023          | 183584            |
| 0.1222        | 1.0   | 1924  | 0.2033          | 366856            |
| 0.1831        | 1.5   | 2886  | 0.1787          | 550664            |
| 0.1471        | 2.0   | 3848  | 0.1856          | 734320            |
| 0.0442        | 2.5   | 4810  | 0.1907          | 918128            |
| 0.1045        | 3.0   | 5772  | 0.1781          | 1100800           |
| 0.0049        | 3.5   | 6734  | 0.1929          | 1284064           |
| 0.0065        | 4.0   | 7696  | 0.2062          | 1467824           |
| 0.0006        | 4.5   | 8658  | 0.2846          | 1650992           |
| 0.0021        | 5.0   | 9620  | 0.2537          | 1834632           |
| 0.0011        | 5.5   | 10582 | 0.2606          | 2018408           |
| 0.1166        | 6.0   | 11544 | 0.2492          | 2202264           |
| 0.0002        | 6.5   | 12506 | 0.4091          | 2386136           |
| 0.0           | 7.0   | 13468 | 0.3892          | 2568880           |
| 0.0002        | 7.5   | 14430 | 0.3723          | 2751696           |
| 0.0           | 8.0   | 15392 | 0.4005          | 2935520           |
| 0.0           | 8.5   | 16354 | 0.4445          | 3119168           |
| 0.0704        | 9.0   | 17316 | 0.4477          | 3302192           |
| 0.0           | 9.5   | 18278 | 0.4609          | 3485264           |
| 0.0038        | 10.0  | 19240 | 0.4613          | 3668584           |
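Validation loss bottoms out at 0.1781 at epoch 3.0 and rises steadily afterwards while training loss approaches zero, a typical overfitting pattern; the evaluation loss reported at the top of this card matches that epoch-3 checkpoint rather than the final one.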
### Framework versions
- PEFT 0.15.2
- Transformers 4.51.3
- PyTorch 2.8.0+cu128
- Datasets 3.6.0
- Tokenizers 0.21.1