This is a 4-bit quantized version of trillionlabs/Tri-7B using Intel AutoRound.

## Model Details

- **Base Model:** trillionlabs/Tri-7B
- **Quantization Method:** Intel AutoRound (Best Configuration)
- **Precision:** 4-bit
- **Group Size:** 128
- **Symmetric:** True
- **Calibration Samples:** 512
- **Tuning Iterations:** 1000
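
For reference, the sketch below shows how a configuration like the one listed above could be produced with the auto-round library. The exact quantization script and calibration data for this checkpoint are not published here, so the argument values, output directory, and export format are assumptions.

```python
# Sketch only: reproducing the listed configuration with Intel AutoRound.
# The exact script used for this checkpoint is not published; treat the
# output directory and export format as assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "trillionlabs/Tri-7B"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

autoround = AutoRound(
    model,
    tokenizer,
    bits=4,          # Precision: 4-bit
    group_size=128,  # Group Size: 128
    sym=True,        # Symmetric: True
    nsamples=512,    # Calibration Samples: 512
    iters=1000,      # Tuning Iterations: 1000
)
autoround.quantize()
autoround.save_quantized("Tri-7B-4bit-AutoRound")
```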

## Requirements

```bash
pip install transformers accelerate auto-round
```
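
With the dependencies installed, the quantized checkpoint should load through the standard transformers API. A minimal sketch is shown below; the prompt and generation settings are illustrative and not part of the original card.

```python
# Minimal inference sketch; prompt and generation settings are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "dddsaty/Tri-7B-4bit-AutoRound"
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```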

## License

This model inherits the Apache 2.0 license from the original Tri-7B model.
