Notice

Performance issues have been found with this quantized model. Please check the newly uploaded model files in Dream-v0-Instruct-7B-4bit.

Quantized Dream-v0-Instruct-7B

This repository contains a 4-bit quantized version of the Dream-v0-Instruct-7B model, optimized for memory-efficient inference while maintaining good performance.

Model Details

  • Base Model: Dream-v0-Instruct-7B
  • Quantization: 4-bit NF4 via bitsandbytes
  • Last Updated: 2025-04-07

Quantization Configuration

The model uses the following quantization settings:

from transformers import BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,                 # store weights in 4-bit precision
    bnb_4bit_quant_type="nf4",         # NormalFloat4 quantization data type
    bnb_4bit_compute_dtype="float16"   # dequantize to float16 for computation
)
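
NF4 (4-bit NormalFloat) is designed for normally distributed weights, which is why it typically preserves accuracy better than plain 4-bit integer quantization; only the stored weights are compressed, while matrix multiplications run in float16 via the compute dtype.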

Usage

Here's how to load and use the quantized model:

from transformers import AutoModel, AutoTokenizer, BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="float16"
)

# trust_remote_code=True is required because Dream ships custom modeling code
model = AutoModel.from_pretrained(
    "Rainnighttram/Dream-7B-bnb-4bit",
    quantization_config=quantization_config,
    device_map="auto",   # place the quantized weights on the available GPU(s)
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(
    "Rainnighttram/Dream-7B-bnb-4bit",
    trust_remote_code=True
)
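
Once the model is loaded, generation goes through the diffusion decoding interface provided by Dream's remote code rather than the standard generate method. Below is a minimal sketch that continues from the snippet above; it assumes the remote code exposes the diffusion_generate method described on the original Dream-v0-Instruct-7B model card, and the sampling values (steps, temperature, top_p) are illustrative, so consult the base card for the exact signature and recommended settings:

messages = [{"role": "user", "content": "Explain 4-bit quantization in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages,
    return_tensors="pt",
    return_dict=True,
    add_generation_prompt=True
)
input_ids = inputs.input_ids.to(model.device)
attention_mask = inputs.attention_mask.to(model.device)

# diffusion_generate comes from the model's remote code; argument names
# follow the base model card and may differ across versions.
output = model.diffusion_generate(
    input_ids,
    attention_mask=attention_mask,
    max_new_tokens=256,
    steps=256,            # number of diffusion denoising steps
    temperature=0.2,
    top_p=0.95,
    return_dict_in_generate=True
)
print(tokenizer.decode(
    output.sequences[0][input_ids.shape[1]:],
    skip_special_tokens=True
))

You can sanity-check the memory savings with model.get_memory_footprint(), which reports the size of the loaded weights in bytes.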

Requirements

  • transformers==4.46.2
  • bitsandbytes
  • torch==2.5.1
  • Python 3.11+
  • accelerate>=0.26.0
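
For a fresh environment, the pinned dependencies above can be installed with, for example:

pip install "transformers==4.46.2" "torch==2.5.1" bitsandbytes "accelerate>=0.26.0"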

Original Model

This is a quantized version of Dream-org/Dream-v0-Instruct-7B. Please refer to the original model card for more details about the base model's capabilities and limitations.

License

This model inherits its license from the original Dream-v0-Instruct-7B model. Please refer to the original model repository for licensing information.
