Issues with vLLM hosting
#1 opened by raghavgg
I installed vLLM from source:
Name: vllm
Version: 0.9.1.dev27+geca18691d.precompiled
Summary: A high-throughput and memory-efficient inference and serving engine for LLMs
Home-page: https://github.com/vllm-project/vllm
Author: vLLM Team
But the model failed to load with the following errors:
ERROR 05-21 12:24:16 [registry.py:362] ImportError: /opt/conda/lib/python3.11/site-packages/flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so: undefined symbol: _ZN3c105ErrorC2ENS_14SourceLocationESs
ValueError: Model architectures ['FalconH1ForCausalLM'] failed to be inspected. Please check the logs for more details.
Hi @raghavgg,
How did you install it from source?
Can you try to build vLLM following these instructions: https://docs.vllm.ai/en/latest/getting_started/installation/gpu.html#build-wheel-from-source
This issue is most likely caused by a mismatch between the CUDA version and the PyTorch version.
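As a quick sanity check, something like the following prints the versions that need to line up (a minimal sketch; adjust it for your environment):

```python
# Minimal sketch: print the versions that typically need to match when a
# compiled extension fails with an undefined-symbol ImportError.
import torch

print("torch:", torch.__version__)          # PyTorch build, e.g. 2.x.y+cuXYZ
print("torch CUDA:", torch.version.cuda)    # CUDA version torch was compiled against
print("CUDA available:", torch.cuda.is_available())

# flash-attn must be built against the same PyTorch ABI; if it was compiled
# for a different torch, importing it raises an undefined-symbol error like
# the one in your logs.
try:
    import flash_attn
    print("flash-attn:", flash_attn.__version__)
except ImportError as exc:
    print("flash-attn import failed:", exc)
```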
Okay, I will try setting it up from a different PyTorch Docker image and let you know.
I updated all the packages and the model loaded, but now I am getting:
TypeError: prepare_mamba2_metadata() got an unexpected keyword argument 'input_ids'
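A quick way to see which `prepare_mamba2_metadata` signature the installed vLLM actually ships, and whether it still accepts an `input_ids` argument (a minimal sketch; the candidate module paths are assumptions and differ between vLLM versions):

```python
# Minimal sketch: locate prepare_mamba2_metadata in the installed vLLM and
# print its signature. The module paths below are guesses and may need to be
# adjusted for your vLLM version.
import importlib
import inspect

candidates = [
    "vllm.model_executor.layers.mamba.mamba2_metadata",
    "vllm.model_executor.models.mamba2",
]

for name in candidates:
    try:
        module = importlib.import_module(name)
    except ImportError:
        continue
    fn = getattr(module, "prepare_mamba2_metadata", None)
    if fn is not None:
        print(f"{name}.prepare_mamba2_metadata{inspect.signature(fn)}")
        break
else:
    print("prepare_mamba2_metadata not found in the candidate modules")
```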