feihu.hf committed · Commit df637d4 · Parent(s): 39a3378

update README.md

- README.md +13 -35
- config.json +2 -2

README.md CHANGED

@@ -1,12 +1,12 @@
 ---
-base_model:
-- Qwen/Qwen2.5-Coder-1.5B-Instruct
-language:
-- en
-library_name: transformers
 license: apache-2.0
 license_link: https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B-Instruct/blob/main/LICENSE
+language:
+- en
+base_model:
+- Qwen/Qwen2.5-Coder-1.5B-Instruct
 pipeline_tag: text-generation
+library_name: transformers
 tags:
 - code
 - codeqwen
@@ -20,9 +20,9 @@ tags:
 
 ## Introduction
 
-Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen).
+Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). As of now, Qwen2.5-Coder has covered six mainstream model sizes, 0.5, 1.5, 3, 7, 14, 32 billion parameters, to meet the needs of different developers. Qwen2.5-Coder brings the following improvements upon CodeQwen1.5:
 
-- Significantly improvements in **code generation**, **code reasoning** and **code fixing**. Base on the strong Qwen2.5, we scale up the training tokens into 5.5 trillion including source code, text-code grounding, Synthetic data, etc.
+- Significantly improvements in **code generation**, **code reasoning** and **code fixing**. Base on the strong Qwen2.5, we scale up the training tokens into 5.5 trillion including source code, text-code grounding, Synthetic data, etc. Qwen2.5-Coder-32B has become the current state-of-the-art open-source codeLLM, with its coding abilities matching those of GPT-4o.
 - A more comprehensive foundation for real-world applications such as **Code Agents**. Not only enhancing coding capabilities but also maintaining its strengths in mathematics and general competencies.
 
 **This repo contains the AWQ-quantized 4-bit instruction-tuned 1.5B Qwen2.5-Coder model**, which has the following features:
@@ -34,10 +34,9 @@ Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (
 - Number of Layers: 28
 - Number of Attention Heads (GQA): 12 for Q and 2 for KV
 - Context Length: Full 32,768 tokens
-  - Please refer to [this section](#processing-long-texts) for detailed instructions on how to deploy Qwen2.5 for handling long texts.
 - Quantization: AWQ 4-bit
 
-For more details, please refer to our [blog](https://qwenlm.github.io/blog/qwen2.5-coder/), [GitHub](https://github.com/QwenLM/Qwen2.5-Coder), [Documentation](https://qwen.readthedocs.io/en/latest/), [Arxiv](https://arxiv.org/abs/2409.12186).
+For more details, please refer to our [blog](https://qwenlm.github.io/blog/qwen2.5-coder-family/), [GitHub](https://github.com/QwenLM/Qwen2.5-Coder), [Documentation](https://qwen.readthedocs.io/en/latest/), [Arxiv](https://arxiv.org/abs/2409.12186).
 
 ## Requirements
 
@@ -89,31 +88,10 @@ generated_ids = [
 response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
 ```
 
-### Processing Long Texts
-
-The current `config.json` is set for context length up to 32,768 tokens.
-To handle extensive inputs exceeding 32,768 tokens, we utilize [YaRN](https://arxiv.org/abs/2309.00071), a technique for enhancing model length extrapolation, ensuring optimal performance on lengthy texts.
-
-For supported frameworks, you could add the following to `config.json` to enable YaRN:
-```json
-{
-  ...,
-  "rope_scaling": {
-    "factor": 4.0,
-    "original_max_position_embeddings": 32768,
-    "type": "yarn"
-  }
-}
-```
-
-For deployment, we recommend using vLLM. 
-Please refer to our [Documentation](https://qwen.readthedocs.io/en/latest/deployment/vllm.html) for usage if you are not familar with vLLM.
-Presently, vLLM only supports static YARN, which means the scaling factor remains constant regardless of input length, **potentially impacting performance on shorter texts**. 
-We advise adding the `rope_scaling` configuration only when processing long contexts is required.
 
 ## Evaluation & Performance
 
-Detailed evaluation results are reported in this [📑 blog](https://qwenlm.github.io/blog/qwen2.5-coder/).
+Detailed evaluation results are reported in this [📑 blog](https://qwenlm.github.io/blog/qwen2.5-coder-family/).
 
 For requirements on GPU memory and the respective throughput, see results [here](https://qwen.readthedocs.io/en/latest/benchmark/speed_benchmark.html).
 
@@ -123,10 +101,10 @@ If you find our work helpful, feel free to give us a cite.
 
 ```
 @article{hui2024qwen2,
-
-
-
-
+      title={Qwen2. 5-Coder Technical Report},
+      author={Hui, Binyuan and Yang, Jian and Cui, Zeyu and Yang, Jiaxi and Liu, Dayiheng and Zhang, Lei and Liu, Tianyu and Zhang, Jiajun and Yu, Bowen and Dang, Kai and others},
+      journal={arXiv preprint arXiv:2409.12186},
+      year={2024}
 }
 @article{qwen2,
       title={Qwen2 Technical Report}, 
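Only the tail of the README's quickstart appears in the hunk above as unchanged context (the `generated_ids = [` header line and the `tokenizer.batch_decode` call). For readers landing on this commit, here is a minimal sketch of the standard `transformers` chat-template flow that context belongs to; the repo id below is assumed from the card's description (AWQ 4-bit, 1.5B, instruction-tuned) rather than taken from the diff, and AWQ checkpoints typically also need the `autoawq` package installed for inference.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id for this model card; adjust if the actual id differs.
model_name = "Qwen/Qwen2.5-Coder-1.5B-Instruct-AWQ"

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",  # AWQ weights are stored in float16, matching "torch_dtype": "float16" in config.json
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Build a chat prompt with the tokenizer's chat template.
messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a quick sort algorithm in Python."},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(**model_inputs, max_new_tokens=512)
# Drop the prompt tokens so only the newly generated continuation is decoded,
# which is what the README's `generated_ids = [` ... `batch_decode` lines do.
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```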
    	
config.json CHANGED

@@ -25,11 +25,11 @@
   },
   "rms_norm_eps": 1e-06,
   "rope_theta": 1000000.0,
-  "sliding_window":
+  "sliding_window": 32768,
   "tie_word_embeddings": true,
   "torch_dtype": "float16",
   "transformers_version": "4.44.1",
   "use_cache": true,
   "use_sliding_window": false,
   "vocab_size": 151936
-}
+}
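The `config.json` change fills in `sliding_window` and rewrites the closing brace; `use_sliding_window` stays `false`, so sliding-window attention remains disabled and the 32,768-token value is effectively informational. A quick way to confirm the merged values from Python is sketched below; the repo id is assumed, and the short commit hash is used as the git revision (substitute the full sha if the short form does not resolve).

```python
from transformers import AutoConfig

# Assumed repo id; "df637d4" is this commit's short hash, used as a git revision.
cfg = AutoConfig.from_pretrained(
    "Qwen/Qwen2.5-Coder-1.5B-Instruct-AWQ",
    revision="df637d4",
)

# Fields shown in this commit's config.json hunk.
print(cfg.sliding_window)      # 32768 after this commit
print(cfg.use_sliding_window)  # False: the sliding window is not actually applied
print(cfg.rope_theta)          # 1000000.0, unchanged context line
```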
