Is AWQ quantization possible for this model?

#17
by VivekMalipatel23 - opened

I am planning to run this on two 3090s with pipeline parallelism, but it looks like the 3090 doesn't support FP8. Can we get an AWQ-quantized version of this model and the other newer Qwen variants?
