sunzhongkai588 committed
Commit 28d6df3 · verified · 1 Parent(s): db21ea4

Update README.md

Files changed (1)
  1. README.md +4 -86
README.md CHANGED
@@ -32,6 +32,10 @@ library_name: transformers
 
  # ERNIE-4.5-300B-A47B
 
+ > [!NOTE]
+ > Note: "**-Paddle**" models use [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) weights, while "**-PT**" models use Transformer-style PyTorch weights.
+
+
  ## ERNIE 4.5 Highlights
 
  The advanced capabilities of the ERNIE 4.5 models, particularly the MoE-based A47B and A3B series, are underpinned by several key technical innovations:
@@ -59,92 +63,6 @@ ERNIE-4.5-300B-A47B is a text MoE Post-trained model, with 300B total parameters
 
  ## Quickstart
 
- ### Model Finetuning with ERNIEKit
-
- [ERNIEKit](https://github.com/PaddlePaddle/ERNIE) is a training toolkit based on PaddlePaddle, specifically designed for the ERNIE series of open-source large models. It provides comprehensive support for scenarios such as instruction fine-tuning (SFT, LoRA) and alignment training (DPO), ensuring optimal performance.
-
- Usage Examples:
-
- ```bash
- # Download model
- huggingface-cli download baidu/ERNIE-4.5-300B-A47B-Paddle --local-dir baidu/ERNIE-4.5-300B-A47B-Paddle
- # SFT
- erniekit train examples/configs/ERNIE-4.5-300B-A47B/sft/run_sft_wint8mix_lora_8k.yaml
- # DPO
- erniekit train examples/configs/ERNIE-4.5-300B-A47B/dpo/run_dpo_wint8mix_lora_8k.yaml
- ```
-
- For more detailed examples, including SFT with LoRA, multi-GPU configurations, and advanced scripts, please refer to the examples folder within the [ERNIEKit](https://github.com/PaddlePaddle/ERNIE) repository.
-
-
-
-
- ### Using FastDeploy
-
- Service deployment can be quickly completed using FastDeploy in the following command. For more detailed usage instructions, please refer to the [FastDeploy Repository](https://github.com/PaddlePaddle/FastDeploy).
-
- **Note**: To deploy on a configuration with 4 GPUs each having at least 80G of memory, specify ```--quantization wint4```. If you specify ```--quantization wint8```, then resources for 8 GPUs are required.
-
- ```bash
- python -m fastdeploy.entrypoints.openai.api_server \
- --model baidu/ERNIE-4.5-300B-A47B-Paddle \
- --port 8180 \
- --metrics-port 8181 \
- --quantization wint4 \
- --tensor-parallel-size 8 \
- --engine-worker-queue-port 8182 \
- --max-model-len 32768 \
- --max-num-seqs 32
- ```
-
- To deploy the W4A8C8 quantized version using FastDeploy, you can run the following command.
-
- ```bash
- python -m fastdeploy.entrypoints.openai.api_server \
- --model baidu/ERNIE-4.5-300B-A47B-W4A8C8-TP4-Paddle \
- --port 8180 \
- --metrics-port 8181 \
- --engine-worker-queue-port 8182 \
- --tensor-parallel-size 4 \
- --max-model-len 32768 \
- --max-num-seqs 32
- ```
-
- To deploy the WINT2 quantized version using FastDeploy on a single 141G GPU, you can run the following command.
-
- ```bash
- python -m fastdeploy.entrypoints.openai.api_server \
- --model "baidu/ERNIE-4.5-300B-A47B-2Bits-Paddle" \
- --port 8180 \
- --metrics-port 8181 \
- --engine-worker-queue-port 8182 \
- --tensor-parallel-size 1 \
- --max-model-len 32768 \
- --max-num-seqs 128
- ```
-
- The following contains a code snippet illustrating how to use ERNIE-4.5-300B-A47B-FP8 generate content based on given inputs.
-
- ```python
- from fastdeploy import LLM, SamplingParams
-
- prompts = [
-     "Hello, my name is",
- ]
-
- sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=128)
-
- model = "baidu/ERNIE-4.5-300B-A47B-FP8-Paddle"
- llm = LLM(model=model, tensor_parallel_size=8, max_model_len=8192, num_gpu_blocks_override=1024, engine_worker_queue_port=9981)
-
- outputs = llm.generate(prompts, sampling_params)
-
- for output in outputs:
-     prompt = output.prompt
-     generated_text = output.outputs.text
-     print("generated_text", generated_text)
- ```
-
  ### Using `transformers` library
 
  **Note**: Before using the model, please ensure you have the `transformers` library installed (version 4.50.0 or higher)
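
The retained note asks for `transformers` 4.50.0 or higher. As a minimal sketch of how a reader might check that requirement before loading the model, the snippet below compares the installed version against the stated minimum. The helper names (`release_tuple`, `meets_minimum`, `check_transformers`) are hypothetical and not part of the README or of `transformers`; only the numeric release components are compared, so pre-release suffixes such as `.dev0` are ignored.

```python
import re
from importlib.metadata import PackageNotFoundError, version

MIN_TRANSFORMERS = "4.50.0"  # minimum version stated in the README note


def release_tuple(ver: str) -> tuple:
    """Return the leading numeric release components of a version string."""
    match = re.match(r"\d+(?:\.\d+)*", ver)
    if match is None:
        raise ValueError(f"unrecognized version string: {ver!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed: str, minimum: str = MIN_TRANSFORMERS) -> bool:
    """Compare release components, padding the shorter tuple with zeros."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_transformers() -> str:
    """Return a human-readable status for the installed transformers package."""
    try:
        installed = version("transformers")
    except PackageNotFoundError:
        return "transformers is not installed"
    if meets_minimum(installed):
        return f"transformers {installed} satisfies >= {MIN_TRANSFORMERS}"
    return f"transformers {installed} is older than {MIN_TRANSFORMERS}"


if __name__ == "__main__":
    print(check_transformers())
```

If the check reports an older or missing package, upgrading with `pip install -U transformers` is the usual remedy.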