Is AWQ quantization possible for this model?
#17 · opened by VivekMalipatel23
I am planning to run this on two 3090s with pipeline parallelism, but it looks like the 3090 doesn't support FP8. Could we get an AWQ-quantized version of this model and of the other newer Qwen variants?
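
For context, this is roughly the kind of checkpoint I'd hope to run: a minimal sketch of producing an AWQ version locally with AutoAWQ, assuming the library supports this architecture and starting from the original BF16 weights rather than the FP8 ones (the model path and output directory below are placeholders, not real repo names):

```python
# Hedged sketch: 4-bit weight-only AWQ quantization with AutoAWQ.
# Assumes AutoAWQ supports this model architecture; paths are placeholders.
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "Qwen/<model-name>"    # placeholder: original BF16 checkpoint
quant_path = "./<model-name>-awq"   # placeholder: output directory

quant_config = {
    "zero_point": True,   # asymmetric quantization with zero points
    "q_group_size": 128,  # group size for the 4-bit weight groups
    "w_bit": 4,           # AWQ is weight-only 4-bit
    "version": "GEMM",
}

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Calibrate and quantize (AutoAWQ uses its default calibration set here).
model.quantize(tokenizer, quant_config=quant_config)

# Save the quantized weights and tokenizer for serving.
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```

The resulting checkpoint could then, in principle, be served across the two 3090s, e.g. with vLLM's `--pipeline-parallel-size 2` (or tensor parallelism instead), though whether it fits depends on the model size and KV-cache headroom.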