qwen_lora_adapter

This model is a fine-tuned version of Qwen/Qwen2.5-1.5B-Instruct on the chatbot_enseignement_train dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0036
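
The snippet below is a minimal loading and inference sketch, assuming the adapter is published as echarif/qwen_lora_adapter and is applied on top of Qwen/Qwen2.5-1.5B-Instruct via the PEFT library; the example prompt and generation settings are illustrative only.

```python
# Minimal sketch: load the base model, attach the LoRA adapter, and generate.
# Assumptions: the adapter lives at echarif/qwen_lora_adapter on the Hub and the
# base checkpoint is Qwen/Qwen2.5-1.5B-Instruct, as stated in this card.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen2.5-1.5B-Instruct"
adapter_id = "echarif/qwen_lora_adapter"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")

# Attach the LoRA adapter weights on top of the frozen base model.
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# Illustrative chat-style prompt (not taken from the training data).
messages = [{"role": "user", "content": "Explain the difference between a list and a tuple in Python."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```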

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 2
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 8
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5
  • mixed_precision_training: Native AMP
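
For reference, here is a hedged sketch of how the hyperparameters above map onto transformers TrainingArguments together with a placeholder PEFT LoraConfig; the LoRA rank, alpha, target modules, and output directory are assumptions, as they are not documented in this card.

```python
# Hedged reconstruction of the training configuration from the list above.
# The LoraConfig values and output_dir are assumptions; they are not documented in this card.
from transformers import TrainingArguments
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                                 # assumed LoRA rank
    lora_alpha=32,                        # assumed scaling factor
    target_modules=["q_proj", "v_proj"],  # assumed target modules
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="qwen_lora_adapter",  # assumed output directory
    learning_rate=1e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,   # total train batch size of 8
    num_train_epochs=5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",
    seed=42,
    fp16=True,                       # mixed precision ("Native AMP")
    eval_strategy="steps",
    eval_steps=100,                  # matches the 100-step cadence in the results table below
    logging_steps=100,
)
```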

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|---------------|--------|------|-----------------|
| 0.0205        | 0.9524 | 100  | 0.0302          |
| 0.0114        | 1.9048 | 200  | 0.0088          |
| 0.0025        | 2.8571 | 300  | 0.0038          |
| 0.0030        | 3.8095 | 400  | 0.0035          |
| 0.0042        | 4.7619 | 500  | 0.0036          |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.6.0
  • Tokenizers 0.21.1