Difference with Block-wise Int8?

#1
by leo98xh - opened
meituan org

Could you explain the difference with Block-wise Int8?

meituan org

The main difference is the quantization granularity. In block-wise int8, all elements within each 128x128 block share one quantization scale. In channel-wise int8, all elements within a column share one quantization scale.
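
To make the granularity difference concrete, here is a minimal sketch in plain Python. It assumes symmetric absmax scaling (the largest absolute value in each group maps to 127), which the reply above does not specify. The helper names `channelwise_scales` and `blockwise_scales` are hypothetical, and the toy example uses a block size of 2 in place of 128 so the result is easy to check by hand.

```python
def channelwise_scales(w):
    """One scale per column: the column's max |value| maps to 127 (assumed scheme)."""
    cols = len(w[0])
    return [max(abs(row[j]) for row in w) / 127.0 for j in range(cols)]


def blockwise_scales(w, block=128):
    """One scale per (block x block) tile: the tile's max |value| maps to 127."""
    rows, cols = len(w), len(w[0])
    # This sketch only handles shapes that divide evenly into tiles.
    assert rows % block == 0 and cols % block == 0
    return [
        [
            max(
                abs(w[i][j])
                for i in range(bi, bi + block)
                for j in range(bj, bj + block)
            ) / 127.0
            for bj in range(0, cols, block)
        ]
        for bi in range(0, rows, block)
    ]


# Toy 4x4 weight matrix; block=2 stands in for the 128x128 block size.
w = [
    [1.0, -2.0, 0.5, 4.0],
    [3.0, 0.25, -8.0, 1.0],
    [-0.5, 6.0, 2.0, -1.0],
    [2.0, 1.0, 0.5, 16.0],
]

col_scales = channelwise_scales(w)         # 4 scales, one per column
blk_scales = blockwise_scales(w, block=2)  # 2x2 grid of scales, one per tile

print(len(col_scales))
print(len(blk_scales), len(blk_scales[0]))
```

For a real weight matrix with R rows and C columns, channel-wise quantization under this scheme stores C scales, while block-wise quantization with 128x128 blocks stores (R/128) x (C/128) scales.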

pkumc changed discussion status to closed
