How to run on an RTX 5090 / Blackwell?
#27 opened by celsowm
I’ve been struggling to get it running on an RTX 5090 with 32 GB of VRAM. The official Docker images from Tencent don’t seem to be compatible with the Blackwell architecture. I even tried building vLLM from source via git clone, but no luck either.
Any hints?
Hi celsowm,
We've updated the vLLM Docker image to CUDA 12.4 on top of the official vLLM base image. Could you check whether it's compatible with Blackwell?
That said, since you already tried a source build, the updated image may not work either.
What error are you getting? Could you paste the full error log here?
Also, for a 5090 with 32 GB of VRAM, that's too little to run an 80 GB model, even with int4 quantization.
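As a rough sanity check of the VRAM claim, here is a back-of-the-envelope estimate, assuming the model in question has on the order of 80B parameters (that assumption, and the helper `weight_vram_gb`, are illustrative, not from this thread):

```python
def weight_vram_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate GiB needed just for the weights.

    Ignores KV cache, activations, and runtime overhead, which only
    make the real requirement larger.
    """
    return n_params * bits_per_weight / 8 / 1024**3


# Assumption: ~80B parameters (not stated explicitly in the thread).
params = 80e9
print(f"bf16 weights: {weight_vram_gb(params, 16):.0f} GiB")  # ~149 GiB
print(f"int4 weights: {weight_vram_gb(params, 4):.0f} GiB")   # ~37 GiB
```

Even at int4, the weights alone exceed the 5090's 32 GiB before counting the KV cache, so the model would need multiple GPUs or CPU offload regardless of the Blackwell compatibility issue.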