Quantized Fine-Tuning
Has anyone tried to fine-tune the quantized model?
I am getting a RuntimeError:
/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py in requires_grad_(self, requires_grad)
2882 """
2883 for p in self.parameters():
-> 2884 p.requires_grad_(requires_grad)
2885 return self
2886
RuntimeError: only Tensors of floating point dtype can require gradients
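From the traceback, the failure presumably comes from requires_grad_(True) being reached on a module whose weights are now bitsandbytes 4-bit parameters, which are stored as integer tensors and therefore cannot require gradients. A minimal, hypothetical workaround sketch (assuming you can edit the place in the script that flips requires_grad_) is to only touch floating-point parameters:

def enable_float_grads(module, requires_grad=True):
    # Quantized (integer-dtype) parameters cannot require gradients,
    # so skip them and only flip the flag on floating-point tensors.
    for p in module.parameters():
        if p.dtype.is_floating_point:
            p.requires_grad_(requires_grad)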
I am updating the sample_finetune_vision script with a basic quantization config like this:
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, prepare_model_for_kbit_training

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_name_or_path,
    # device_map="cuda",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    # if you do not use Ampere or later GPUs, change attention to "eager"
    _attn_implementation='eager',
    quantization_config=quantization_config,
)
# Delete audio layers (code not shown in the post)
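# (Not in the original post.) A purely hypothetical sketch of dropping the audio
# submodules before training; matching on "audio" in the module name is an
# assumption and depends on the actual module names in the checkpoint:
audio_names = [n for n, _ in model.named_modules() if "audio" in n.lower()]
# keep only the top-most matches so we do not touch children of modules we replace
top_level = [n for n in audio_names
             if not any(n.startswith(p + ".") for p in audio_names if p != n)]
for name in top_level:
    parent_name, _, child_name = name.rpartition(".")
    parent = model.get_submodule(parent_name) if parent_name else model
    setattr(parent, child_name, torch.nn.Identity())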
model = prepare_model_for_kbit_training(
    model,
    use_gradient_checkpointing=False,
    gradient_checkpointing_kwargs={'use_reentrant': False},
)
config = LoraConfig(
    task_type="CAUSAL_LM",
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    inference_mode=False,
    target_modules=["out_proj"],  # Example: target the attention layers "q_proj", "k_proj", "v_proj"
)
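Not shown above: the LoRA config still has to be applied to the model. A minimal sketch of how I would expect the rest to look, assuming the adapters are attached with PEFT's generic get_peft_model (the sample script may do this differently):

from peft import get_peft_model

model = get_peft_model(model, config)
# Only the LoRA adapter weights (floating point) should be trainable now;
# the 4-bit base weights stay frozen, which avoids the requires_grad error.
model.print_trainable_parameters()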
This is urgent; if you find any solution, please let me know.