jiaqiz committed
Commit 065b1ae · verified · 1 Parent(s): a20fc0d

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
````diff
@@ -127,12 +127,12 @@ We developed this model using Llama-3.3-Nemotron-Super-49B-v1 as its foundation.
 **Supported Hardware Microarchitecture Compatibility:** <br>
 * NVIDIA Ampere <br>
 * NVIDIA Hopper <br>
-* NVIDIA Turing <br>
+
 **Supported Operating System(s):** Linux <br>
 
 ## Quick Start
 
-We recommend serving the model with vLLM.
+We recommend serving the model with vLLM. You can use the model with 2 or more 80GB GPUs (NVIDIA Ampere or newer) with at least 100GB of free disk space to accommodate the download.
 
 ```
 pip install vllm==0.8.3
@@ -276,7 +276,7 @@ v1.0
 
 
 # Inference:
-**Engine:** [Triton](https://developer.nvidia.com/triton-inference-server) <br>
+**Engine:** vLLM <br>
 **Test Hardware:** H100, A100 80GB, A100 40GB <br>
 
 
````
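For context, the updated Quick Start recommends serving with vLLM on two or more 80GB GPUs. A minimal sketch of what that launch could look like, using the generic `vllm serve` entrypoint shipped with vllm==0.8.3; the model repository ID below is a placeholder, since the actual name is not shown in this diff:

```
# Sketch only: <model-repo-id> is a placeholder -- substitute the actual Hugging Face repository name.
# --tensor-parallel-size 2 shards the weights across the two 80GB GPUs the README calls for.
vllm serve <model-repo-id> --tensor-parallel-size 2
```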