Strange, why is Q3K_XL even smaller than Q3K_M?
#10, opened by X5R
Isn't the only difference from Q3K_M that some tensors use Q8_0?
The quant provided by others is larger than Q3K_M.
It's normal: some layers don't need higher precision, so they can stay at a lower bit width, which can offset the tensors kept in Q8_0.
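To make the size argument concrete, here is a rough sketch of how the per-tensor mix drives file size. The block sizes are the ggml layouts for Q3_K, Q4_K and Q8_0 as I understand them; the 7B parameter count and both tensor splits are made up for illustration and are not the actual recipes of either quant in this repo.

```python
# ggml block layouts: (bytes per block, weights per block).
BLOCKS = {
    "Q3_K": (110, 256),  # 3.4375 bits per weight
    "Q4_K": (144, 256),  # 4.5 bits per weight
    "Q8_0": (34, 32),    # 8.5 bits per weight
}


def size_bytes(mix):
    """Total tensor payload in bytes for a list of (n_weights, quant_type)."""
    total = 0.0
    for n_weights, qtype in mix:
        block_bytes, block_weights = BLOCKS[qtype]
        total += n_weights * block_bytes / block_weights
    return total


# Hypothetical 7B-parameter model; both splits are invented for illustration.
# "M-like": a large share of the weights is bumped from Q3_K to Q4_K.
m_like = [(4.5e9, "Q3_K"), (2.5e9, "Q4_K")]
# "XL-like": a small share is kept in Q8_0, everything else stays at Q3_K.
xl_like = [(6.7e9, "Q3_K"), (0.3e9, "Q8_0")]

for name, mix in (("M-like", m_like), ("XL-like", xl_like)):
    print(f"{name}: {size_bytes(mix) / 1e9:.2f} GB")
```

With these assumed splits the XL-like mix comes out smaller, even though it contains Q8_0 tensors: a few high-precision tensors cost less than raising a large fraction of the model by about one bit per weight. The real numbers depend on which tensors each recipe actually upgrades.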