---
library_name: peft
license: other
base_model: mistralai/Ministral-8B-Instruct-2410
tags:
- llama-factory
- lora
- generated_from_trainer
model-index:
- name: Ministral-8B-Instruct-2410-PsyCourse-doc-fold4
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# Ministral-8B-Instruct-2410-PsyCourse-doc-fold4

This model is a LoRA fine-tuned version of [mistralai/Ministral-8B-Instruct-2410](https://huggingface.co/mistralai/Ministral-8B-Instruct-2410) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0154
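Since this repository contains PEFT LoRA adapter weights rather than a full model, a minimal usage sketch might look like the following. The `adapter_id` is an assumption inferred from the model name; replace it with the actual repository id.

```python
# Minimal sketch: load the base model and attach this LoRA adapter with PEFT.
# adapter_id is assumed from the model name; adjust to the actual repo id.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Ministral-8B-Instruct-2410"
adapter_id = "Ministral-8B-Instruct-2410-PsyCourse-doc-fold4"  # assumption

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)

messages = [{"role": "user", "content": "Hello!"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```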
					
						
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
					
						
- learning_rate: 0.0001
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 16
- optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08, no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5.0
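For reference, a rough sketch of how the values above would map onto Hugging Face `TrainingArguments`. The run itself was produced with LLaMA-Factory, and the `output_dir` below is only a placeholder.

```python
# Sketch only: the hyperparameters above expressed as transformers TrainingArguments.
# The output_dir value is a placeholder; the original run used LLaMA-Factory.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="Ministral-8B-Instruct-2410-PsyCourse-doc-fold4",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=16,  # effective train batch size: 1 x 16 = 16
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=5.0,
)
```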
					
						
### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.0866        | 0.3951 | 10   | 0.0700          |
| 0.0303        | 0.7901 | 20   | 0.0278          |
| 0.0158        | 1.1852 | 30   | 0.0200          |
| 0.017         | 1.5802 | 40   | 0.0180          |
| 0.0132        | 1.9753 | 50   | 0.0162          |
| 0.01          | 2.3704 | 60   | 0.0156          |
| 0.0147        | 2.7654 | 70   | 0.0154          |
| 0.0114        | 3.1605 | 80   | 0.0152          |
| 0.0105        | 3.5556 | 90   | 0.0152          |
| 0.0116        | 3.9506 | 100  | 0.0154          |
| 0.0103        | 4.3457 | 110  | 0.0153          |
| 0.0075        | 4.7407 | 120  | 0.0154          |

					
						
### Framework versions

- PEFT 0.12.0
- Transformers 4.46.1
- Pytorch 2.5.1+cu124
- Datasets 3.1.0
- Tokenizers 0.20.3
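
If you want to match this setup locally, a small sketch for comparing installed package versions against the ones listed above:

```python
# Sketch: print installed versions next to the versions listed in this card.
import datasets
import peft
import tokenizers
import torch
import transformers

card_versions = {
    "peft": (peft, "0.12.0"),
    "transformers": (transformers, "4.46.1"),
    "torch": (torch, "2.5.1+cu124"),
    "datasets": (datasets, "3.1.0"),
    "tokenizers": (tokenizers, "0.20.3"),
}
for name, (module, listed) in card_versions.items():
    print(f"{name}: installed {module.__version__}, card lists {listed}")
```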