End of training

Files changed:
- README.md: +11 -4
- adapter_model.bin: +1 -1

README.md CHANGED

```diff
@@ -66,7 +66,7 @@ lora_model_dir: null
 lora_r: 8
 lora_target_linear: true
 lr_scheduler: cosine
-max_steps:
+max_steps: 50
 micro_batch_size: 2
 mlflow_experiment_name: /tmp/52a0af70d05ca085_train_data.json
 model_type: AutoModelForCausalLM
@@ -91,7 +91,7 @@ wandb_name: 3cc7522e-b8b9-4231-a45b-19615c5cf651
 wandb_project: Gradients-On-Demand
 wandb_run: your_name
 wandb_runid: 3cc7522e-b8b9-4231-a45b-19615c5cf651
-warmup_steps:
+warmup_steps: 10
 weight_decay: 0.0
 xformers_attention: null
 
@@ -102,6 +102,8 @@ xformers_attention: null
 # b71353e5-235c-462a-81db-aa926fed5d78
 
 This model is a fine-tuned version of [unsloth/Qwen2.5-0.5B](https://huggingface.co/unsloth/Qwen2.5-0.5B) on the None dataset.
+It achieves the following results on the evaluation set:
+- Loss: nan
 
 ## Model description
 
@@ -128,14 +130,19 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 8
 - optimizer: Use OptimizerNames.ADAMW_BNB with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
-- lr_scheduler_warmup_steps:
-- training_steps:
+- lr_scheduler_warmup_steps: 10
+- training_steps: 50
 
 ### Training results
 
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
 | No log        | 0.0002 | 1    | nan             |
+| 0.0           | 0.0023 | 10   | nan             |
+| 0.0           | 0.0046 | 20   | nan             |
+| 0.0           | 0.0068 | 30   | nan             |
+| 0.0           | 0.0091 | 40   | nan             |
+| 0.0           | 0.0114 | 50   | nan             |
 
 
 ### Framework versions
```
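
The README change records the two scheduler fields that were blank before: `warmup_steps: 10` and `max_steps: 50`, mirrored in the card as `lr_scheduler_warmup_steps: 10` and `training_steps: 50`. For reference, a minimal sketch of the corresponding cosine-with-warmup schedule via `transformers.get_cosine_schedule_with_warmup`; the parameter, optimizer, and learning rate below are stand-ins, since the diff names `ADAMW_BNB` but shows no learning rate:

```python
import torch
from transformers import get_cosine_schedule_with_warmup

# Stand-in parameter and optimizer; the actual run used OptimizerNames.ADAMW_BNB
# with a learning rate not shown in this diff (2e-4 here is arbitrary).
param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.AdamW([param], lr=2e-4)

# The values this commit records: 10 warmup steps, 50 training steps total.
scheduler = get_cosine_schedule_with_warmup(
    optimizer, num_warmup_steps=10, num_training_steps=50
)

for step in range(1, 51):
    optimizer.step()
    scheduler.step()
    if step in (1, 10, 25, 50):
        print(f"step {step:2d}: lr = {scheduler.get_last_lr()[0]:.2e}")
```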
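
Since the commit ships a LoRA adapter (the ~17.7 MB `adapter_model.bin` below) rather than full model weights, it would normally be applied on top of the base checkpoint with `peft`. A sketch, assuming a hypothetical adapter repo id derived from the run name in the card; the actual repository id is not part of this diff:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "unsloth/Qwen2.5-0.5B"
# Hypothetical repo id taken from the run name in the card; replace with the
# actual adapter repository.
ADAPTER = "your-username/b71353e5-235c-462a-81db-aa926fed5d78"

tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(BASE)
model = PeftModel.from_pretrained(model, ADAPTER)

inputs = tokenizer("Hello, world", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```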
adapter_model.bin CHANGED

```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:ade263fb46e8764954887bcb71f29c9e84566cf30bb97af9abc070785a681bf8
 size 17717130
```
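
`adapter_model.bin` is stored as a Git LFS pointer, so the diff only swaps the content hash: the `oid` is the plain SHA-256 of the file, and `size` is its byte count (unchanged at 17717130, i.e., the adapter was retrained in place). A minimal sketch for checking a downloaded copy against the new pointer; the local path is an assumption:

```python
import hashlib
import os

EXPECTED_OID = "ade263fb46e8764954887bcb71f29c9e84566cf30bb97af9abc070785a681bf8"
EXPECTED_SIZE = 17_717_130
PATH = "adapter_model.bin"  # assumed local download path

digest = hashlib.sha256()
with open(PATH, "rb") as f:
    # Hash in 1 MiB chunks so large files never sit fully in memory.
    for chunk in iter(lambda: f.read(1 << 20), b""):
        digest.update(chunk)

assert os.path.getsize(PATH) == EXPECTED_SIZE, "size mismatch"
assert digest.hexdigest() == EXPECTED_OID, "sha256 mismatch"
print("adapter_model.bin matches the LFS pointer")
```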