Mac MPS support?

#11
by pylotlight - opened

Does this have MPS acceleration?

OpenBMB org

Not yet, but we will support it as soon as possible.

What are the incompatibility issues preventing MPS support for VoxCPM-0.5B?

OpenBMB org

What are the incompatibility issues preventing MPS support for VoxCPM-0.5B?

It looks like PyTorch has some issues with GQA support when using MPS. We are still working on finding a solution.

Thanks! That matches my experience converting the model to Core ML, where SDPA/GQA on MPS produces artifacts.
The mitigation I used during the Core ML conversion was to set export VOXCPM_FORCE_EAGER_ATTENTION=1, which forces the float32 eager attention path and avoids the SDPA anomalies on MPS. I also used float32 end-to-end on MPS (instead of bf16/fp16) to improve numerical stability.
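For anyone following along, here is a minimal sketch of what a float32 eager attention path for GQA can look like in plain PyTorch. It is not the VoxCPM code or the code from my conversion. The function names `pick_device` and `eager_gqa_attention` are hypothetical, and it assumes query and key sequences have the same length. It only illustrates the idea: expand the KV heads by hand, do the softmax in float32, and skip the fused SDPA kernel.

```python
import math

import torch


def pick_device() -> torch.device:
    """Use MPS when PyTorch reports it as available, otherwise fall back to CPU."""
    return torch.device("mps" if torch.backends.mps.is_available() else "cpu")


def eager_gqa_attention(q, k, v, causal=True):
    """Grouped-query attention written out by hand and computed in float32.

    Hypothetical helper for illustration only.
    q:    (batch, n_q_heads,  seq, head_dim)
    k, v: (batch, n_kv_heads, seq, head_dim), with n_q_heads % n_kv_heads == 0
    """
    n_q_heads, n_kv_heads = q.shape[1], k.shape[1]
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    group = n_q_heads // n_kv_heads

    # Remember the caller's dtype, then do all of the math in float32.
    out_dtype = q.dtype
    q, k, v = q.float(), k.float(), v.float()

    # Repeat each KV head so that `group` consecutive query heads share it.
    k = k.repeat_interleave(group, dim=1)
    v = v.repeat_interleave(group, dim=1)

    # Scaled dot-product scores: (batch, n_q_heads, seq, seq).
    scores = (q @ k.transpose(-2, -1)) / math.sqrt(q.shape[-1])

    if causal:
        seq_q, seq_k = scores.shape[-2], scores.shape[-1]
        # True above the diagonal marks future positions to hide.
        future = torch.triu(
            torch.ones(seq_q, seq_k, device=scores.device), diagonal=1
        ).bool()
        scores = scores.masked_fill(future, float("-inf"))

    weights = torch.softmax(scores, dim=-1)
    return (weights @ v).to(out_dtype)


if __name__ == "__main__":
    device = pick_device()
    q = torch.randn(1, 4, 6, 16, device=device)  # 4 query heads
    k = torch.randn(1, 2, 6, 16, device=device)  # 2 shared KV heads
    v = torch.randn(1, 2, 6, 16, device=device)
    print(eager_gqa_attention(q, k, v).shape)
```

This trades speed for predictability: the attention matrix is materialized in full, so memory grows with the square of the sequence length, but every step is an ordinary float32 op.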

OpenBMB org

@pylotlight We've updated the code to support MPS inference. Please try cloning the repository to install it and then test whether you can run inference using MPS.

I just tested this on an M3 Pro chip and it worked great: 13-15 it/s.
