Unit 1 Bonus Notebook Problem with Multi-GPU processing
Hey there. I was experimenting with how the notebook runs on different GPU setups. When I tried T4 x 2 GPUs and launched training with
trainer.train()
trainer.save_model()
I got this error:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0! (when checking argument for argument mat2 in method wrapper_CUDA_mm)
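If I read the traceback right, the failing op is a matrix multiply whose two operands sit on different GPUs. Just to make sure I understand the error class, here is a tiny standalone snippet (guarded, since it only actually triggers on a machine with at least two CUDA devices):

```python
import torch

def mm_across_devices():
    """Try a matmul with operands on two different GPUs."""
    if torch.cuda.device_count() < 2:
        return "needs >= 2 GPUs"
    a = torch.randn(2, 2, device="cuda:0")
    b = torch.randn(2, 2, device="cuda:1")
    try:
        # raises RuntimeError: tensors found on cuda:0 and cuda:1
        torch.mm(a, b)
        return "no error"
    except RuntimeError as exc:
        return str(exc)

print(mm_across_devices())
```

So somewhere during training, an activation ends up on one GPU while the next layer's weights are on the other.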
As I understand it, I have to distribute the model, the dataset, and the peft_config (I guess?) across those two GPUs.
1) when we defined the model like this:
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    attn_implementation="eager",
    device_map="auto",
)
isn't device_map="auto" doing all of the distribution for me?
2) if not, how should I manually distribute the model over the GPUs and train it? Do I also need to move the dataset (or anything else) to a device?
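For what it's worth, my current understanding (which may be wrong) is that device_map="auto" splits the *layers* of the model across the GPUs for inference-style placement, rather than replicating the model for training. A purely illustrative sketch of the two kinds of placement, with made-up layer names (in transformers you can inspect the real one via model.hf_device_map):

```python
# Illustrative only: the kind of placement device_map="auto" might
# produce on two GPUs, versus pinning everything to one device.
# Layer names here are made up, not from the actual model.
auto_map = {
    "model.embed_tokens": 0,
    "model.layers.0": 0,
    "model.layers.1": 1,  # later layers spill onto the second GPU
    "lm_head": 1,
}

# device_map={"": 0} would instead put every module on cuda:0
pinned_map = {"": 0}

def devices_used(device_map):
    """Return the set of GPU indices a device map touches."""
    return set(device_map.values())

print(devices_used(auto_map))    # {0, 1} -> layers straddle two GPUs
print(devices_used(pinned_map))  # {0}    -> everything on one device
```

Is it the "straddling" case that breaks training here, and should I be pinning to one device (or using something like DDP) instead?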
Just to clarify: the notebook is otherwise unchanged, so the code is no different from the one in the repo. I didn't touch the model, the dataset, or the LoRA config. Thanks in advance!