baidu/ERNIE-4.5-VL-28B-A3B-PT

Image-Text-to-Text

ernie4_5_moe_vl

feature-extraction

Update README.md

#3

by sunzhongkai588 - opened 25 days ago

base: refs/heads/main ← from: refs/pr/3

Files changed (1)

README.md +3 -52

@@ -32,6 +32,9 @@ library_name: transformers
 # ERNIE-4.5-VL-28B-A3B
 ## ERNIE 4.5 Highlights
 The advanced capabilities of the ERNIE 4.5 models, particularly the MoE-based A47B and A3B series, are underpinned by several key technical innovations:
@@ -62,58 +65,6 @@ ERNIE-4.5-VL-28B-A3B is a multimodal MoE Chat model, with 28B total parameters a
 ## Quickstart
-### FastDeploy Inference
-Quickly deploy services using FastDeploy as shown below. For more detailed usage, refer to the [FastDeploy GitHub Repository](https://github.com/PaddlePaddle/FastDeploy).
-**Note**: For single-card deployment, at least 80GB of GPU memory is required.
-```bash
-python -m fastdeploy.entrypoints.openai.api_server \
-       --model baidu/ERNIE-4.5-VL-28B-A3B-Paddle \
-       --port 8180 \
-       --metrics-port 8181 \
-       --engine-worker-queue-port 8182 \
-       --max-model-len 32768 \
-       --enable-mm \
-       --reasoning-parser ernie-45-vl \
-       --max-num-seqs 32
-```
-The ERNIE-4.5-VL model supports enabling or disabling thinking mode through request parameters.
-#### Enable Thinking Mode
-```bash
-curl -X POST "http://0.0.0.0:8180/v1/chat/completions" \
--H "Content-Type: application/json" \
--d '{
-  "messages": [
-    {"role": "user", "content": [
-      {"type": "image_url", "image_url": {"url": "https://paddlenlp.bj.bcebos.com/datasets/paddlemix/demo_images/example2.jpg"}},
-      {"type": "text", "text": "Descript this image"}
-    ]}
-  ],
-  "metadata": {"enable_thinking": true}
-}'
-```
-#### Disable Thinking Mode
-```bash
-curl -X POST "http://0.0.0.0:8180/v1/chat/completions" \
--H "Content-Type: application/json" \
--d '{
-  "messages": [
-    {"role": "user", "content": [
-      {"type": "image_url", "image_url": {"url": "https://paddlenlp.bj.bcebos.com/datasets/paddlemix/demo_images/example2.jpg"}},
-      {"type": "text", "text": "Descript this image"}
-    ]}
-  ],
-  "metadata": {"enable_thinking": false}
-}'
-```
 ### Using `transformers` library
 Here is an example of how to use the transformers library for inference:

 # ERNIE-4.5-VL-28B-A3B
+> [!NOTE]
+> Note: "**-Paddle**" models use [PaddlePaddle](https://github.com/PaddlePaddle/Paddle) weights, while "**-PT**" models use Transformer-style PyTorch weights.
 ## ERNIE 4.5 Highlights