Update README.md
README.md
CHANGED
@@ -169,10 +169,14 @@ Exporting to ExecuTorch requires you clone and install [ExecuTorch](https://gith
 
 
 ## Convert quantized checkpoint to ExecuTorch's format
+
+ExecuTorch expects the checkpoint keys to have certain names in order to export. The following script converts the quantized checkpoint from Hugging Face to the one ExecuTorch expects.
 ```
 python -m executorch.examples.models.phi_4_mini.convert_weights phi4-mini-8dq4w.bin phi4-mini-8dq4w-converted.bin
 ```
 
+Once the checkpoint is converted, we can export to ExecuTorch's PTE format.
+
 ## Export to an ExecuTorch *.pte with XNNPACK
 ```
 PARAMS="executorch/examples/models/phi_4_mini/config.json"
@@ -188,7 +192,7 @@ python -m executorch.examples.models.llama.export_llama \
 ```
 
 ## Running in a mobile app
-The
+The PTE file can be run with ExecuTorch. See the [instructions](https://pytorch.org/executorch/main/llm/llama-demo-ios.html) for doing this in iOS.
 On iPhone 15 Pro, the model runs at 17.3 tokens/sec and uses 3206 Mb of memory.
 
 
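
The text added in this change says ExecuTorch expects checkpoint keys to follow certain names. As a rough illustration of that idea only, the sketch below renames keys in a plain dict using a prefix table. It is not the `convert_weights` script: the `RENAMES` table, the key names, and the `rename_keys` helper are all hypothetical, and the real mapping lives in the ExecuTorch source.

```python
# Illustrative sketch only: the prefixes below are hypothetical and are NOT
# the mapping used by executorch.examples.models.phi_4_mini.convert_weights.
from typing import Any, Dict

# Hypothetical (source prefix -> target prefix) pairs.
RENAMES: Dict[str, str] = {
    "model.layers.": "layers.",
    "model.embed_tokens.": "tok_embeddings.",
}


def rename_keys(state_dict: Dict[str, Any], renames: Dict[str, str]) -> Dict[str, Any]:
    """Return a new dict whose keys have the first matching prefix replaced."""
    converted: Dict[str, Any] = {}
    for key, value in state_dict.items():
        new_key = key
        for old_prefix, new_prefix in renames.items():
            if key.startswith(old_prefix):
                new_key = new_prefix + key[len(old_prefix):]
                break
        # Refuse to silently overwrite a tensor if two keys collide.
        if new_key in converted:
            raise ValueError(f"duplicate key after renaming: {new_key}")
        converted[new_key] = value
    return converted


if __name__ == "__main__":
    example = {
        "model.layers.0.weight": 1,
        "model.embed_tokens.weight": 2,
        "lm_head.weight": 3,
    }
    print(rename_keys(example, RENAMES))
```

Keys that match no prefix pass through unchanged, and a collision raises an error instead of dropping a value, which seems the safer default for a checkpoint conversion.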