
Unexpected inference results after xLoRA training with PEFT API

#2 opened by VaterLand

I am experimenting with xLoRA training using the PEFT API. As shown in the setup below, I assembled the xLoRA model and trained it; training and validation loss decreased as expected.
However, after training finishes, inference produces unexpected output (see below).

Code for training:

```python
import torch
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer
from peft import XLoraConfig, get_peft_model

model_config = AutoConfig.from_pretrained(args.model_name_or_path)
tokenizer = AutoTokenizer.from_pretrained(args.model_name_or_path, trust_remote_code=True)

# X-LoRA mixes the four pre-trained LoRA adapters below via a learned classifier.
config = XLoraConfig(
    task_type="CAUSAL_LM",
    hidden_size=model_config.hidden_size,
    xlora_depth=4,
    adapters={
        "0": args.Expression,
        "1": args.Other,
        "2": args.Statement,
        "3": args.TypeSafety,
    },
)

model = AutoModelForCausalLM.from_pretrained(
    args.model_name_or_path,
    trust_remote_code=True,
    device_map="auto",
    use_cache=False,  # disable the KV cache during training
    torch_dtype=torch.float16,
)

# Wrap the base model; only the X-LoRA classifier (and adapters) are trained.
model = get_peft_model(model, config)
```
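For completeness, here is roughly how I save the trained weights after the loop (a minimal sketch, simplified from my actual loop; I am assuming args.peft_model_path is the same directory I later load from at inference):

```python
# Sketch of the save step after training.
# save_pretrained on a PEFT-wrapped model writes the adapter / classifier
# weights and config to the given directory.
model.save_pretrained(args.peft_model_path)
tokenizer.save_pretrained(args.peft_model_path)
```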

Training log (excerpt):

```
10/04/2025 12:06:53 - INFO - main - epoch = 0
10/04/2025 12:06:55 - INFO - main - Epoch: 0, Train_Step: 0/5835, Train_Loss = 1.7981
...
10/04/2025 13:17:13 - INFO - main - epoch: 0, step: 5834/5835, loss: 0.1221, lr: 4.52e-05, oom: 0, time: 4220s
10/04/2025 13:17:13 - INFO - main - -------start validation--------
10/04/2025 13:21:47 - INFO - main - Epoch = 0, Validation Loss = 0.1103391930767118
10/04/2025 13:21:48 - INFO - main - Current is Best Loss = 0.1103391930767118
```

Code for inference:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained(args.model_name_or_path, trust_remote_code=True)
base_model = AutoModelForCausalLM.from_pretrained(
    args.model_name_or_path,
    trust_remote_code=True,
    # quantization_config=int8_config,
    device_map="auto",
    use_cache=False,
    # return_dict=True,
    torch_dtype=torch.float16,
)

# Load the trained X-LoRA weights on top of the base model, then merge.
model = PeftModel.from_pretrained(base_model, args.peft_model_path)
model = model.merge_and_unload()
```
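The generation step itself is standard; a simplified sketch of what I run per example (prompt and generation settings abbreviated):

```python
# Simplified sketch of the per-example generation step.
model.eval()
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens.
prediction = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:],
    skip_special_tokens=True,
)
```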

Actual behavior (unexpected output):

```
1/560, Pre_EM = bad, Input_len = 325, Pre_Fix = [
'<|begin▁of▁sentence|>...<|begin▁of▁sentence|>\t<|begin▁of▁sentence|> #<|begin▁of▁sentence|>\t<|begin▁of▁sentence|> (<|begin▁of▁sentence|> xplotPlotllustrllustrllustrllustrllustrllustr...'
]
```

Expected behavior:

```
1/560, Pre_EM = good, Input_len = 325, Pre_Fix = [
' // fix_start if (dataset != null) {\n// fix_end...'
]
```

I used get_peft_model to assemble xLoRA, and the training loss converged well.
For inference I loaded the model with PeftModel.from_pretrained and called merge_and_unload().
The outputs look corrupted and unrelated to the training data, as if the adapters were never applied.

Could this be a bug in PEFT's xLoRA integration, or am I missing an additional step for proper inference with xLoRA-trained adapters?
I referred to the PEFT documentation here: https://huggingface.co/docs/peft/package_reference/xlora
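One thing I am unsure about: since X-LoRA computes its adapter scalings dynamically per input, I don't know whether merge_and_unload() is even meaningful here. An untested variant I am considering keeps the PeftModel wrapper instead of merging:

```python
# Untested idea: generate through the PeftModel wrapper directly, so the
# X-LoRA classifier can still compute per-input adapter scalings.
model = PeftModel.from_pretrained(base_model, args.peft_model_path)
model.eval()
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256)
```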

EricB (LAMM: MIT Laboratory for Atomistic and Molecular Mechanics org)

Hey @VaterLand ! Thanks for the issue. Can you please post it here: https://github.com/EricLBuehler/xlora?

Hey @EricB! Thank you very much for the feedback! I'll post this issue at the link you provided.
