---
library_name: coreml
license: apache-2.0
tags:
- text-generation
---
# Mistral-7B-Instruct-v0.3 + CoreML
> [!IMPORTANT]
> ❗ This repo requires the use of the macOS Sequoia (15) Developer Beta to utilize the latest and greatest CoreML has to offer!
> Sign up for the Apple Beta Software Program [here](https://beta.apple.com/en/) to get access.
> Check out the companion blog post to learn more about what's new in iOS 18 & macOS 15 [here](https://hf.co/blog/mistral-coreml).
This repo contains [Mistral 7B Instruct v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3) converted to CoreML in both FP16 & Int4 precision.
Mistral-7B-Instruct-v0.3 is an instruct fine-tuned version of Mistral-7B-v0.3 by Mistral AI.
Compared to the v0.2 model, Mistral-7B-v0.3 has the following changes:
- Extended vocabulary to 32768
- Supports v3 Tokenizer
- Supports function calling
To learn more about the model, we recommend consulting its documentation and the original model card.
## Download
Install `huggingface-cli`:
```bash
pip install -U "huggingface_hub[cli]"
```
To download one of the `.mlpackage` folders to the `models` directory:
```bash
huggingface-cli download \
--local-dir models \
--local-dir-use-symlinks False \
apple/mistral-coreml \
--include "StatefulMistral7BInstructInt4.mlpackage/*"
```
To download everything, remove the `--include` argument.
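The same download can also be scripted from Python with `snapshot_download` from the `huggingface_hub` library installed above. This is a minimal sketch: the `fetch` guard and the helper's name are our own additions, there so the example doesn't accidentally trigger a multi-gigabyte download when copied verbatim.

```python
from huggingface_hub import snapshot_download

def fetch_mistral_coreml(local_dir="models", fetch=False):
    """Download the Int4 .mlpackage into `local_dir`.

    Pass fetch=True to actually download (several GB); by default
    this is a dry run that does nothing and returns None.
    """
    if not fetch:
        return None
    return snapshot_download(
        repo_id="apple/mistral-coreml",
        allow_patterns="StatefulMistral7BInstructInt4.mlpackage/*",
        local_dir=local_dir,
    )

path = fetch_mistral_coreml()  # dry run by default
```

To mirror "download everything", drop the `allow_patterns` argument.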
## Integrate in Swift apps
The [`huggingface/swift-chat`](https://github.com/huggingface/swift-chat) repository contains a demo app to get you up and running quickly!
You can integrate the model right into your Swift apps using the `preview` branch of [`huggingface/swift-transformers`](https://github.com/huggingface/swift-transformers/tree/preview).
## Limitations
The Mistral 7B Instruct model is a quick demonstration that the base model can be easily fine-tuned to achieve compelling performance.
It does not have any moderation mechanisms. We're looking forward to engaging with the community on ways to
make the model finely respect guardrails, allowing for deployment in environments requiring moderated outputs.