
MiniCPM-V-4.5-abliterated-int4

This is a 4-bit (NF4) quantized version of huihui-ai/Huihui-MiniCPM-V-4_5-abliterated, produced with bitsandbytes.

Model Details

  • Base Model: huihui-ai/Huihui-MiniCPM-V-4_5-abliterated
  • Quantization: 4-bit (NF4) using bitsandbytes
  • Model Size: ~6.4 GB (85.8% reduction from original 45.28 GB)
  • Compute dtype: float16
  • Double quantization: Disabled for better performance

Quantization Configuration

{
  "load_in_4bit": true,
  "bnb_4bit_compute_dtype": "float16",
  "bnb_4bit_quant_type": "nf4",
  "bnb_4bit_use_double_quant": false,
  "llm_int8_skip_modules": ["out_proj", "kv_proj", "lm_head"],
  "quant_method": "bitsandbytes"
}
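
This configuration is embedded in the saved checkpoint and is applied automatically on load. If you want to reproduce the quantization from the unquantized base model yourself, the equivalent transformers-side setup would look roughly like the sketch below (a sketch only; the field names mirror the JSON above).

from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Sketch: BitsAndBytesConfig equivalent to the stored quantization_config above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=False,
    llm_int8_skip_modules=["out_proj", "kv_proj", "lm_head"],
)

# Re-quantizing the original model on the fly (not needed when loading this repo):
model = AutoModelForCausalLM.from_pretrained(
    "huihui-ai/Huihui-MiniCPM-V-4_5-abliterated",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)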

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the quantized checkpoint; the bitsandbytes quantization_config stored in
# the repo is applied automatically, so no extra quantization arguments are needed.
model = AutoModelForCausalLM.from_pretrained(
    "wavespeed/MiniCPM-V-4_5-abliterated-int4",
    device_map="auto",          # spread layers across available GPUs / CPU
    trust_remote_code=True,     # required for the MiniCPM-V model code
    torch_dtype=torch.float16,  # matches the compute dtype used for quantization
)

tokenizer = AutoTokenizer.from_pretrained(
    "wavespeed/MiniCPM-V-4_5-abliterated-int4",
    trust_remote_code=True,
)
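
The exact inference interface comes from the model's remote code. Earlier MiniCPM-V releases expose a chat() helper along the lines sketched below; treat this as an assumption and check the base model card for the interface of this version (the image path is a placeholder).

from PIL import Image

# Single-turn visual question; the msgs format follows earlier MiniCPM-V releases.
image = Image.open("example.jpg").convert("RGB")  # placeholder image path
msgs = [{"role": "user", "content": [image, "Describe this image."]}]

answer = model.chat(image=None, msgs=msgs, tokenizer=tokenizer)
print(answer)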

Requirements

  • transformers
  • bitsandbytes
  • torch
  • accelerate

Note on File Size

The model files are larger (~6.4 GB) than a pure 4-bit packing would suggest. This is expected for bitsandbytes quantization: the modules listed in llm_int8_skip_modules are kept in higher precision, and the packed NF4 weights are stored together with per-block quantization statistics so they can be dequantized on the fly during inference. Runtime GPU memory is dominated by these quantized weights plus activations and the KV cache, and remains far below the ~45 GB footprint of the original model.
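
As a quick sanity check after loading the model as in the Usage section, transformers' get_memory_footprint() reports the weight footprint in memory (a minimal sketch):

import torch

# Weight footprint of the loaded model; with the 4-bit checkpoint this should be
# roughly in line with the on-disk size and far below the original ~45 GB.
print(f"Weights: {model.get_memory_footprint() / 1e9:.2f} GB")

# Peak CUDA memory observed so far (includes activations from any inference already run).
if torch.cuda.is_available():
    print(f"Peak CUDA memory: {torch.cuda.max_memory_allocated() / 1e9:.2f} GB")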

License

Same as the original model; please refer to the license of the base model, huihui-ai/Huihui-MiniCPM-V-4_5-abliterated.

Acknowledgments

Thanks to huihui-ai for the abliterated base model and to the OpenBMB team for the original MiniCPM-V-4.5.
