Please help: Mistral-Small does not run on macOS with an Apple M2 CPU (AssertionError)

#34
by NickolasCh - opened

Both Mistral-Small models (Base and Instruct) fail to start on macOS with an M2 CPU (CPU only).
CPU info:
Model Name: MacBook Pro
Model Identifier: Mac14,6
Model Number: MNWE3LL/A
Chip: Apple M2 Max
Total Number of Cores: 12 (8 performance and 4 efficiency)
Memory: 32 GB

Version:
System Version: macOS 14.6.1 (23G93)
Kernel Version: Darwin 23.6.0

Built-in GPU info:
Chipset Model: Apple M2 Max
Type: GPU
Bus: Built-In
Total Number of Cores: 38

Installed according to the instructions:
pip install vllm --pre --extra-index-url https://wheels.vllm.ai/nightly --upgrade
vllm serve mistralai/Mistral-Small-3.1-24B-Instruct-2503 --tokenizer_mode mistral --config_format mistral --load_format mistral --tool-call-parser mistral --enable-auto-tool-choice --limit_mm_per_prompt 'image=10' --tensor-parallel-size 2

The "vllm serve ..." command crashes:
INFO 03-21 10:42:03 [shm_broadcast.py:258] vLLM message queue communication handle: Handle(local_reader_ranks=[1], buffer_handle=(1, 4194304, 6, 'psm_9afa231b'), local_subscribe_addr='ipc:///var/folders/r2/vsgj3nzd139181d32rn8r9xm0000gn/T/211e11d6-2c7d-4007-a21b-eb22ae983084', remote_subscribe_addr=None, remote_addr_ipv6=False)
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] Exception in worker VllmWorkerProcess while processing method init_device.
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] Traceback (most recent call last):
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] File "........./py_vllm/lib/python3.12/site-packages/vllm/executor/multiproc_worker_utils.py", line 232, in _run_worker_process
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] output = run_method(worker, method, args, kwargs)
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] File "........./py_vllm/lib/python3.12/site-packages/vllm/utils.py", line 2216, in run_method
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] return func(*args, **kwargs)
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] ^^^^^^^^^^^^^^^^^^^^^
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] File "........./py_vllm/lib/python3.12/site-packages/vllm/worker/worker_base.py", line 604, in init_device
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] self.worker.init_device() # type: ignore
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] ^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] File "........./py_vllm/lib/python3.12/site-packages/vllm/worker/cpu_worker.py", line 220, in init_device
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] self.init_distributed_environment()
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] File "........./py_vllm/lib/python3.12/site-packages/vllm/worker/cpu_worker.py", line 383, in init_distributed_environment
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] ensure_model_parallel_initialized(
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] File "........./py_vllm/lib/python3.12/site-packages/vllm/distributed/parallel_state.py", line 1005, in ensure_model_parallel_initialized
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] initialize_model_parallel(tensor_model_parallel_size,
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] File "........./py_vllm/lib/python3.12/site-packages/vllm/distributed/parallel_state.py", line 932, in initialize_model_parallel
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] _TP = init_model_parallel_group(group_ranks,
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] File "........./py_vllm/lib/python3.12/site-packages/vllm/distributed/parallel_state.py", line 730, in init_model_parallel_group
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] return GroupCoordinator(
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] ^^^^^^^^^^^^^^^^^
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] File "........./py_vllm/lib/python3.12/site-packages/vllm/distributed/parallel_state.py", line 218, in __init__
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] self.mq_broadcaster = MessageQueue.create_from_process_group(
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] File "........./py_vllm/lib/python3.12/site-packages/vllm/distributed/device_communicators/shm_broadcast.py", line 528, in create_from_process_group
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] buffer_io = MessageQueue.create_from_handle(handle, group_rank)
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] File "........./py_vllm/lib/python3.12/site-packages/vllm/distributed/device_communicators/shm_broadcast.py", line 273, in create_from_handle
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] self.buffer = ShmRingBuffer(*handle.buffer_handle)
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] File "........./py_vllm/lib/python3.12/site-packages/vllm/distributed/device_communicators/shm_broadcast.py", line 129, in __init__
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] self.shared_memory.size == self.total_bytes_of_buffer)
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(VllmWorkerProcess pid=47716) ERROR 03-21 10:42:03 [multiproc_worker_utils.py:238] AssertionError
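
The failing assertion compares the size of an attached shared memory block with the size the code expected. The Python documentation notes that some platforms allocate shared memory in page-size chunks, so a block may be larger than requested. Below is a minimal sketch for checking whether that happens on this machine. It uses only the standard library; the helper name probe_shm_size and the example size are hypothetical, and this is a guess at the cause, not a confirmed diagnosis of the vLLM error.

```python
from multiprocessing import shared_memory


def probe_shm_size(requested: int) -> tuple[int, int]:
    """Create a shared memory block, attach to it by name, and return
    (size seen by the creating handle, size seen by the attaching handle)."""
    creator = shared_memory.SharedMemory(create=True, size=requested)
    try:
        # Attach a second handle by name, as a reader process would.
        attached = shared_memory.SharedMemory(name=creator.name)
        try:
            return creator.size, attached.size
        finally:
            attached.close()
    finally:
        creator.close()
        creator.unlink()  # remove the block so nothing is leaked


if __name__ == "__main__":
    # Hypothetical size, chosen so it is not a multiple of a 16 KiB page.
    requested = 4194304 * 6 + 12
    created, attached = probe_shm_size(requested)
    print(f"requested={requested} creator={created} attached={attached}")
```

If either reported size is larger than the requested one, a strict equality check on the block size would fail on this platform regardless of how vLLM was installed.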
