End of training

- README.md +2 -1
- all_results.json +13 -0
- eval_results.json +8 -0
- train_results.json +9 -0
- trainer_state.json +0 -0
- training_eval_loss.png +0 -0
- training_loss.png +0 -0
README.md
CHANGED

@@ -4,6 +4,7 @@ license: llama3
 base_model: meta-llama/Meta-Llama-3-8B-Instruct
 tags:
 - llama-factory
+- prefix-tuning
 - generated_from_trainer
 model-index:
 - name: train_cola_1757340184
@@ -15,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->

 # train_cola_1757340184

-This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on 
+This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the cola dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.9521
 - Num Input Tokens Seen: 6929680
    	
all_results.json
ADDED

@@ -0,0 +1,13 @@
+{
+    "epoch": 20.0,
+    "eval_loss": 0.9521387219429016,
+    "eval_runtime": 13.1396,
+    "eval_samples_per_second": 65.147,
+    "eval_steps_per_second": 32.573,
+    "num_input_tokens_seen": 6929680,
+    "total_flos": 3.1204035840638976e+17,
+    "train_loss": 0.23893776387709736,
+    "train_runtime": 6717.4331,
+    "train_samples_per_second": 22.911,
+    "train_steps_per_second": 11.457
+}
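The aggregate metrics in all_results.json can be cross-checked programmatically. A minimal sketch in Python — the dict literal simply mirrors the committed file, and the derived sample counts are approximations recovered from runtime × throughput, not values stored in the repo:

```python
import json

# Mirrors the committed all_results.json above.
all_results = json.loads("""
{
    "epoch": 20.0,
    "eval_loss": 0.9521387219429016,
    "eval_runtime": 13.1396,
    "eval_samples_per_second": 65.147,
    "eval_steps_per_second": 32.573,
    "num_input_tokens_seen": 6929680,
    "total_flos": 3.1204035840638976e+17,
    "train_loss": 0.23893776387709736,
    "train_runtime": 6717.4331,
    "train_samples_per_second": 22.911,
    "train_steps_per_second": 11.457
}
""")

# Approximate evaluation set size: runtime * samples/sec (~856 samples).
eval_samples = round(all_results["eval_runtime"] * all_results["eval_samples_per_second"])

# Approximate total training samples processed across all 20 epochs.
train_samples = round(all_results["train_runtime"] * all_results["train_samples_per_second"])

print(eval_samples, train_samples)
```

These back-of-the-envelope figures are only consistency checks; the exact per-step history lives in trainer_state.json.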
    	
eval_results.json
ADDED

@@ -0,0 +1,8 @@
+{
+    "epoch": 20.0,
+    "eval_loss": 0.9521387219429016,
+    "eval_runtime": 13.1396,
+    "eval_samples_per_second": 65.147,
+    "eval_steps_per_second": 32.573,
+    "num_input_tokens_seen": 6929680
+}
    	
train_results.json
ADDED

@@ -0,0 +1,9 @@
+{
+    "epoch": 20.0,
+    "num_input_tokens_seen": 6929680,
+    "total_flos": 3.1204035840638976e+17,
+    "train_loss": 0.23893776387709736,
+    "train_runtime": 6717.4331,
+    "train_samples_per_second": 22.911,
+    "train_steps_per_second": 11.457
+}
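As a quick sanity check, the throughput numbers in train_results.json are internally consistent: tokens seen divided by runtime gives the average token throughput, and runtime × steps-per-second recovers the implied total optimizer step count. A small sketch — the dict mirrors the file above, and the derived numbers are approximations, not repo contents:

```python
# Mirrors the committed train_results.json above.
train_results = {
    "epoch": 20.0,
    "num_input_tokens_seen": 6929680,
    "total_flos": 3.1204035840638976e+17,
    "train_loss": 0.23893776387709736,
    "train_runtime": 6717.4331,
    "train_samples_per_second": 22.911,
    "train_steps_per_second": 11.457,
}

# Average token throughput over the whole run (tokens / second).
tokens_per_second = train_results["num_input_tokens_seen"] / train_results["train_runtime"]

# Total optimizer steps implied by the reported steps-per-second.
total_steps = round(train_results["train_runtime"] * train_results["train_steps_per_second"])

# Average tokens processed per optimizer step.
tokens_per_step = train_results["num_input_tokens_seen"] / total_steps

print(round(tokens_per_second), total_steps)
```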
    	
trainer_state.json
ADDED

The diff for this file is too large to render; see the raw diff.
    	
training_eval_loss.png
ADDED

training_loss.png
ADDED
