vllm support a100
#2 by HuggingLianWang - opened
Can this model be served directly using vLLM on 8x A100 (80GB)?
HuggingLianWang changed discussion title from "vllm support" to "vllm support a100"
Yes, but it will run at only about 3.7 tokens per second.
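For a rough sense of what 3.7 tokens per second means in practice, here is a minimal sketch. The function name `generation_seconds` and the example token counts are illustrative assumptions, not something from this thread, and the estimate assumes a constant decode rate with no prompt-processing or batching effects.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float = 3.7) -> float:
    """Estimate wall-clock seconds to generate num_tokens at a fixed decode rate.

    The default of 3.7 tokens/s is the figure quoted in the reply above;
    everything else here is a simplifying assumption.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    # Hypothetical response lengths, chosen only for illustration.
    for n in (100, 500, 2000):
        print(f"{n} tokens: about {generation_seconds(n):.0f} s")
```

At that rate, responses of a few hundred tokens take on the order of minutes, so it is probably more suitable for offline or batch use than for interactive chat.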
Thank you very much, we will try it.