Unexpected inference results after xLoRA training with PEFT API
I am experimenting with xLoRA training using the PEFT API. With the setup below, I assembled an xLoRA model and trained it; training and validation loss decreased as expected.
However, when I run inference after training finishes, the output is not what I expect (see below).
**Code for training**

```python
import torch
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer
from peft import XLoraConfig, get_peft_model

model_config = AutoConfig.from_pretrained(args.model_name_or_path)
tokenizer = AutoTokenizer.from_pretrained(args.model_name_or_path, trust_remote_code=True)

# One xLoRA config pointing at the four pre-trained LoRA adapters.
config = XLoraConfig(
    task_type="CAUSAL_LM",
    hidden_size=model_config.hidden_size,
    xlora_depth=4,
    adapters={
        "0": args.Expression,
        "1": args.Other,
        "2": args.Statement,
        "3": args.TypeSafety,
    },
)

model = AutoModelForCausalLM.from_pretrained(
    args.model_name_or_path,
    trust_remote_code=True,
    device_map="auto",
    use_cache=False,  # KV cache disabled for training
    torch_dtype=torch.float16,
)
model = get_peft_model(model, config)
```
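For completeness, the adapter is saved at the end of training with the standard PEFT flow. A minimal sketch (`args.output_dir` is a placeholder for my checkpoint directory, not the exact path from my script):

```python
# Save the trained xLoRA adapter (and tokenizer) to the checkpoint directory.
model.save_pretrained(args.output_dir)
tokenizer.save_pretrained(args.output_dir)
```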
**Training log (excerpt)**

```text
10/04/2025 12:06:53 - INFO - main - epoch = 0
10/04/2025 12:06:55 - INFO - main - Epoch: 0, Train_Step: 0/5835, Train_Loss = 1.7981
...
10/04/2025 13:17:13 - INFO - main - epoch: 0, step: 5834/5835, loss: 0.1221, lr: 4.52e-05, oom: 0, time: 4220s
10/04/2025 13:17:13 - INFO - main - -------start validation--------
10/04/2025 13:21:47 - INFO - main - Epoch = 0, Validation Loss = 0.1103391930767118
10/04/2025 13:21:48 - INFO - main - Current is Best Loss = 0.1103391930767118
```
**Code for inference**

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained(args.model_name_or_path, trust_remote_code=True)
base_model = AutoModelForCausalLM.from_pretrained(
    args.model_name_or_path,
    trust_remote_code=True,
    # quantization_config=int8_config,
    device_map="auto",
    use_cache=False,
    # return_dict=True,
    torch_dtype=torch.float16,
)
# Load the trained xLoRA adapter, then merge it into the base weights.
model = PeftModel.from_pretrained(base_model, args.peft_model_path)
model = model.merge_and_unload()
```
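The evaluation loop then generates with this model. A minimal sketch of the generation call (the `prompt` and generation parameters here are illustrative, not the exact values from my evaluation script):

```python
# Illustrative generation call; prompt and max_new_tokens are examples.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```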
**Actual behavior (unexpected output)**

```text
1/560, Pre_EM = bad, Input_len = 325, Pre_Fix = [
'<|begin▁of▁sentence|>...<|begin▁of▁sentence|>\t<|begin▁of▁sentence|> #<|begin▁of▁sentence|>\t<|begin▁of▁sentence|> (<|begin▁of▁sentence|> xplotPlotllustrllustrllustrllustrllustrllustr...'
]
```
**Expected behavior**

```text
1/560, Pre_EM = good, Input_len = 325, Pre_Fix = [
' // fix_start if (dataset != null) {\n// fix_end...'
]
```
I assembled the xLoRA model with `get_peft_model`, and the training loss converged well.
For inference, I loaded the trained adapter with `PeftModel.from_pretrained` and then called `merge_and_unload()`.
The outputs look corrupted and unrelated to the training data, as if the adapters were never applied.
Could this be a bug in PEFT's xLoRA integration, or am I missing an additional step for inference with xLoRA-trained adapters?
I referred to the PEFT documentation here: https://huggingface.co/docs/peft/package_reference/xlora
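In particular, I am not sure whether `merge_and_unload()` is meaningful for xLoRA at all: as I understand the docs, xLoRA computes per-token adapter scalings with a learned classifier at inference time, which a static merge cannot reproduce. Would the correct path simply be to keep the PEFT model un-merged, along these lines (sketch)?

```python
# Sketch: load the trained xLoRA adapter without merging, so the xLoRA
# scaling classifier stays active during generation.
model = PeftModel.from_pretrained(base_model, args.peft_model_path)
model.eval()
# ...then call model.generate(...) as usual, with no merge_and_unload().
```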
Hey @VaterLand! Thanks for the issue. Can you please post it here: https://github.com/EricLBuehler/xlora?