Mac MPS/CPU support
Not working on my Mac M3, not even on CPU or MPS.
I dug into this; it's supposedly an issue with the vLLM library? Any chance of a change in dependency away from vLLM in the future?
Hi @MLLife, for local execution on Mac with Metal, you can run directly using transformers (a minimal sketch follows the list below). Additionally, we've added support in llama.cpp and other runtimes based on that engine:
- Ollama
  - The Ollama model currently requires the v0.5.13-rc1 preview release, which will move to a full release soon
  - By default, the Ollama model uses `Q4_K_M` quantization for the LLM portion, but you can also see the other precisions in the full list of tags
- LM Studio
  - A collection of different precisions is available here
- llama-cli (llama.cpp)
  - Official GGUF conversions are available here: https://huggingface.co/ibm-research/granite-vision-3.2-2b-GGUF
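
For the transformers route on Apple silicon, a minimal sketch looks like the following. It mirrors the `from_pretrained` pattern that appears in the traceback later in this thread; the example question is mine, and `accelerate` must be installed alongside `transformers` (which also comes up below):

```python
import torch
from huggingface_hub import hf_hub_download
from transformers import AutoModelForVision2Seq, AutoProcessor

# Prefer the Metal (MPS) backend when available, otherwise fall back to CPU.
device = "mps" if torch.backends.mps.is_available() else "cpu"

model_path = "ibm-granite/granite-vision-3.2-2b"
processor = AutoProcessor.from_pretrained(model_path)
model = AutoModelForVision2Seq.from_pretrained(model_path).to(device)

# Sample image shipped in the model repo (also referenced later in this thread).
img_path = hf_hub_download(repo_id=model_path, filename="chart.png")

# Build the prompt with the model's chat template, then generate.
conversation = [{
    "role": "user",
    "content": [
        {"type": "image", "url": img_path},
        {"type": "text", "text": "What is the highest value in the chart?"},
    ],
}]
inputs = processor.apply_chat_template(
    conversation,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(device)

output = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(output[0], skip_special_tokens=True))
```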
Hi @MLLife! Can you please elaborate on what errors you are seeing with vLLM and how you're installing it?
vLLM does not have support for a Metal backend yet, but it should work on CPU (although I don't have access to a Mac with an M3 chip to check). vLLM support for Apple silicon is experimental, though, so you may need to build it from source if you want to run it there. Are you building vLLM from source (i.e., similar to this)?
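
In case it helps with debugging, here is roughly what a CPU run would look like once vLLM is built; a minimal sketch assuming the generic `LLM.chat()` interface (the image URL and question are placeholders of mine, and I haven't verified this on Apple silicon):

```python
from vllm import LLM, SamplingParams

# Assumes a working CPU build of vLLM (Apple silicon support is
# experimental, so this typically means building from source).
llm = LLM(model="ibm-granite/granite-vision-3.2-2b")

# LLM.chat() applies the model's chat template, including image inputs.
messages = [{
    "role": "user",
    "content": [
        # Placeholder image URL; substitute your own.
        {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
        {"type": "text", "text": "What is the highest value in the chart?"},
    ],
}]
outputs = llm.chat(messages, SamplingParams(temperature=0, max_tokens=128))
print(outputs[0].outputs[0].text)
```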
@abrooks9944, thanks for the pointer. Now I'm getting this issue: https://github.com/vllm-project/vllm/issues/13593
I'm getting this error when trying on my M2 with the `mps` device.
```
      7 model_path = "ibm-granite/granite-vision-3.2-2b"
      8 processor = AutoProcessor.from_pretrained(model_path)
----> 9 model = AutoModelForVision2Seq.from_pretrained(model_path).to(device)
     11 # prepare image and text prompt, using the appropriate prompt template
     13 img_path = hf_hub_download(repo_id=model_path, filename='chart.png')

File ~/dev/ai-stuff/experiments/ibm-granite/my-experiments/.venv/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py:571, in _BaseAutoModelClass.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    569 if model_class.config_class == config.sub_configs.get("text_config", None):
    570     config = config.get_text_config()
--> 571 return model_class.from_pretrained(
    572     pretrained_model_name_or_path, *model_args, config=config, **hub_kwargs, **kwargs
    573 )
    574 raise ValueError(
    575     f"Unrecognized configuration class {config.__class__} for this kind of AutoModel: {cls.__name__}.\n"
    576     f"Model type should be one of {', '.join(c.__name__ for c in cls._model_mapping.keys())}."
    577 )

File ~/dev/ai-stuff/experiments/ibm-granite/my-experiments/.venv/lib/python3.11/site-packages/transformers/modeling_utils.py:279, in restore_default_torch_dtype.<locals>._wrapper(*args, **kwargs)
    277 old_dtype = torch.get_default_dtype()
    278 try:
--> 279     return func(*args, **kwargs)
    280 finally:
...
   3735 else:
-> 3736     init_contexts = [no_init_weights(), init_empty_weights()]
   3738 return init_contexts

NameError: name 'init_empty_weights' is not defined
```
Tried with different versions of transformers >= 4.49, but none seem to work. I'm running Python 3.11.
Update: I got it working on my M2 after installing accelerate:

```
pip install accelerate
```
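
For anyone hitting the same thing: that fix lines up with the traceback above, since `init_empty_weights` is provided by accelerate, and transformers only has it available when accelerate is installed. A one-line sketch of the root cause:

```python
# transformers calls accelerate's init_empty_weights() to initialize model
# weights on the meta device; without accelerate installed, the name is
# never imported and from_pretrained raises the NameError shown above.
from accelerate import init_empty_weights
```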