What's the quantization format of 4bit / 8bit?

#39
by WatermelonEast - opened

Or does it mean fp4 / fp8?

Google org

Hi @WatermelonEast ,

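With bitsandbytes, 8-bit loading quantizes the weights to int8 (the LLM.int8() scheme), not fp8. 4-bit loading uses the 4-bit data type selected by `bnb_4bit_quant_type`, which is `fp4` by default and can be set to `nf4`. To enable either mode, pass one of the following configs when loading the model: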

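from transformers import BitsAndBytesConfig
# Use one of the two configs, not both.
quantization_config = BitsAndBytesConfig(load_in_8bit=True)  # 8-bit weights
quantization_config = BitsAndBytesConfig(load_in_4bit=True)  # 4-bit weights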
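
For completeness, here is a rough sketch of how such a config might be passed to `from_pretrained`. The helper names `quant_kwargs` and `load_quantized` are made up for this example, and `model_id` is whatever repository you are loading. It assumes `transformers`, `bitsandbytes`, and `accelerate` are installed and a supported GPU is available.

```python
def quant_kwargs(bits, quant_type="fp4"):
    """Build BitsAndBytesConfig keyword arguments (hypothetical helper)."""
    if bits == 8:
        return {"load_in_8bit": True}
    if bits == 4:
        # bnb_4bit_quant_type accepts "fp4" or "nf4".
        if quant_type not in ("fp4", "nf4"):
            raise ValueError("quant_type must be 'fp4' or 'nf4'")
        return {"load_in_4bit": True, "bnb_4bit_quant_type": quant_type}
    raise ValueError("bits must be 4 or 8")


def load_quantized(model_id, bits, quant_type="fp4"):
    """Load a causal LM with bitsandbytes quantization (hypothetical helper)."""
    # Imported lazily so quant_kwargs can be used without transformers installed.
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    config = BitsAndBytesConfig(**quant_kwargs(bits, quant_type))
    return AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=config,
        device_map="auto",  # let accelerate place the quantized weights
    )
```

Calling `load_quantized(model_id, 4, "nf4")` would request 4-bit NF4 weights, while `load_quantized(model_id, 8)` would request int8.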

Thank you.
