llama3-1b-lora-optimum-int8 / inc_config.json
hedtorresca: 🚀 Upload of int8-quantized model with Optimum (commit b101400, verified)
{
"distillation": {},
"neural_compressor_version": "3.3.1",
"optimum_version": "1.24.0",
"pruning": {},
"quantization": {
"dataset_num_samples": null,
"is_static": false
},
"torch_version": "2.6.0+cu124",
"transformers_version": "4.48.3"
}
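The config above records a dynamic int8 quantization: `is_static` is `false`, so activation ranges are computed at runtime and no calibration set is needed (hence `dataset_num_samples` is `null`). A minimal sketch of reading these fields, with the JSON inlined so the snippet is self-contained (the filename `inc_config.json` comes from the repo; everything else here is illustrative):

```python
import json

# inc_config.json contents, inlined for a self-contained example.
INC_CONFIG = """
{
  "distillation": {},
  "neural_compressor_version": "3.3.1",
  "optimum_version": "1.24.0",
  "pruning": {},
  "quantization": {
    "dataset_num_samples": null,
    "is_static": false
  },
  "torch_version": "2.6.0+cu124",
  "transformers_version": "4.48.3"
}
"""

cfg = json.loads(INC_CONFIG)
quant = cfg["quantization"]

# is_static == False means dynamic quantization: no calibration dataset,
# which is consistent with dataset_num_samples being null.
mode = "static" if quant["is_static"] else "dynamic"
print(mode)                          # dynamic
print(cfg["neural_compressor_version"])  # 3.3.1
```

In practice you would read the file from the repo (e.g. `json.load(open("inc_config.json"))`) instead of inlining it; the inline string just keeps the sketch runnable on its own.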