train_record_1753094160

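This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the record dataset. It achieves the following results on the evaluation set (the loss equals the lowest validation loss in the training results, logged at epoch 2.0; the token count is the total at the end of training):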

Loss: 0.2664
Num Input Tokens Seen: 464483424

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 123
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 10.0
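
As a rough illustration of how lr_scheduler_type: cosine and lr_scheduler_warmup_ratio: 0.1 interact with the learning rate above, the sketch below computes the learning rate at a given optimizer step. It assumes the conventional shape (linear warmup followed by a half-cosine decay to zero) and takes the total step count, 312420, from the last row of the training results. The function name `lr_at_step` and the constants are hypothetical; this is not the training code itself.

```python
import math

PEAK_LR = 5e-05        # learning_rate from the card
WARMUP_RATIO = 0.1     # lr_scheduler_warmup_ratio from the card
TOTAL_STEPS = 312420   # assumption: final step in the training results table

# Assumption: warmup length is the ratio applied to the total step count.
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_RATIO)


def lr_at_step(step: int) -> float:
    """Linear warmup to PEAK_LR, then half-cosine decay to zero."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    for step in (0, WARMUP_STEPS, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(f"step {step:>6}: lr = {lr_at_step(step):.3e}")
```

Under these assumptions warmup ends at step 31242, which coincides with the end of epoch 1.0 in the training results.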

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.2352	0.5	15621	0.3098	23227520
0.2159	1.0	31242	0.2921	46454112
0.2584	1.5	46863	0.2711	69694624
0.179	2.0	62484	0.2664	92908288
0.1124	2.5	78105	0.2872	116099296
0.1993	3.0	93726	0.2908	139351808
0.1558	3.5	109347	0.3250	162566976
0.1782	4.0	124968	0.3064	185790304
0.1187	4.5	140589	0.3423	208997696
0.0991	5.0	156210	0.3587	232243968
0.1312	5.5	171831	0.4016	255458112
0.2085	6.0	187452	0.3625	278686752
0.1709	6.5	203073	0.4192	301925344
0.1077	7.0	218694	0.4075	325137568
0.1939	7.5	234315	0.4813	348361920
0.0795	8.0	249936	0.4618	371592704
0.1507	8.5	265557	0.5457	394838368
0.0835	9.0	281178	0.5146	418033696
0.1975	9.5	296799	0.6221	441282560
0.1145	10.0	312420	0.6266	464483424
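
To make the trend in the table easier to inspect, the following sketch parses the rows above and picks the evaluation point with the lowest validation loss. The values are copied from the table; the `Row` type and the `parse_rows` helper are hypothetical names introduced only for this example.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    tokens_seen: int


# Rows copied from the training results table, in the same column order.
TABLE = """\
0.2352 0.5 15621 0.3098 23227520
0.2159 1.0 31242 0.2921 46454112
0.2584 1.5 46863 0.2711 69694624
0.179 2.0 62484 0.2664 92908288
0.1124 2.5 78105 0.2872 116099296
0.1993 3.0 93726 0.2908 139351808
0.1558 3.5 109347 0.3250 162566976
0.1782 4.0 124968 0.3064 185790304
0.1187 4.5 140589 0.3423 208997696
0.0991 5.0 156210 0.3587 232243968
0.1312 5.5 171831 0.4016 255458112
0.2085 6.0 187452 0.3625 278686752
0.1709 6.5 203073 0.4192 301925344
0.1077 7.0 218694 0.4075 325137568
0.1939 7.5 234315 0.4813 348361920
0.0795 8.0 249936 0.4618 371592704
0.1507 8.5 265557 0.5457 394838368
0.0835 9.0 281178 0.5146 418033696
0.1975 9.5 296799 0.6221 441282560
0.1145 10.0 312420 0.6266 464483424
"""


def parse_rows(text: str) -> List[Row]:
    """Split each non-empty line on whitespace and convert the five fields."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        train, epoch, step, val, tokens = line.split()
        rows.append(Row(float(train), float(epoch), int(step), float(val), int(tokens)))
    return rows


rows = parse_rows(TABLE)
best = min(rows, key=lambda r: r.val_loss)

if __name__ == "__main__":
    print(f"lowest validation loss: {best.val_loss} at epoch {best.epoch} (step {best.step})")
    print(f"final validation loss:  {rows[-1].val_loss} at epoch {rows[-1].epoch}")
```

The lowest validation loss is 0.2664 at epoch 2.0, the same value reported as the evaluation loss at the top of the card. Validation loss then rises to 0.6266 by epoch 10.0, which suggests the later checkpoints overfit.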

Framework versions

PEFT 0.15.2
Transformers 4.51.3
PyTorch 2.7.1+cu126
Datasets 3.6.0
Tokenizers 0.21.1
