Remove vLLM FP8 Limitation
#2 opened by simon-mo
This has been fixed as of the latest v0.8.5 release.
ERROR 04-29 09:46:24 [core.py:396] ValueError("type fp8e4nv not supported in this architecture. The supported fp8 dtypes are ('fp8e4b15', 'fp8e5')")
I got this when running it on an A100. Does it not use the Marlin kernels by default?
jklj077 changed pull request status to merged
I'm still encountering this error on 0.8.5.
I'm using two 3090s with -tp 2, if that makes a difference.
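For anyone hitting the same error: the message says the `fp8e4nv` type is not available on the GPU architecture in use. Below is a minimal sketch of a check for that. It assumes native `fp8e4nv` requires CUDA compute capability 8.9 or newer, which is my understanding and not something confirmed in this thread. The helper name `supports_native_fp8` and the table are hypothetical, for illustration only; the H100 entry is included purely as a comparison point.

```python
# Assumption (not confirmed in this thread): the "fp8e4nv" type from the
# error message needs CUDA compute capability 8.9 or newer.
MIN_NATIVE_FP8_CAPABILITY = (8, 9)

# Compute capabilities of the GPUs mentioned above, plus H100 for comparison.
KNOWN_CAPABILITIES = {
    "A100": (8, 0),
    "RTX 3090": (8, 6),
    "H100": (9, 0),
}


def supports_native_fp8(capability):
    """Return True if a (major, minor) capability meets the assumed minimum."""
    return tuple(capability) >= MIN_NATIVE_FP8_CAPABILITY


if __name__ == "__main__":
    for name, capability in KNOWN_CAPABILITIES.items():
        status = "ok" if supports_native_fp8(capability) else "below assumed minimum"
        print(f"{name}: sm_{capability[0]}{capability[1]} -> {status}")
```

On a machine with PyTorch installed, `torch.cuda.get_device_capability()` returns the same kind of `(major, minor)` tuple for the local GPU, so its result can be passed straight to the helper. Under the stated assumption, both the A100 and the 3090 fall below the minimum, which would be consistent with the error appearing on both.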