Qwen2.5-VL-3B-SFT

Qwen2.5-VL-3B-SFT-wo_single_turn-7k

Training Data: PyVision-SFT

Single-turn trajectories filtered out

learning_rate: 1.0e-5
lr_scheduler_type: cosine
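
As a rough illustration of what `lr_scheduler_type: cosine` combined with a warmup phase typically means, the sketch below computes the learning rate at a given optimizer step: a linear ramp from 0 to the peak `learning_rate`, followed by a half-cosine decay to 0. The function name `cosine_lr` and the step counts in the usage lines are hypothetical, and the exact schedule used for these checkpoints may differ in details such as a minimum learning rate.

```python
import math


def cosine_lr(step, total_steps, warmup_steps, base_lr=1.0e-5):
    """Learning rate at `step`: linear warmup, then cosine decay to zero.

    Illustrative sketch only; `base_lr` defaults to the learning_rate
    listed in this README, everything else is an assumed example value.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Hypothetical run of 100 optimizer steps with 10 warmup steps.
for s in (0, 5, 10, 55, 100):
    print(s, cosine_lr(s, total_steps=100, warmup_steps=10))
```

The peak value is reached exactly at the end of warmup, so a larger `warmup_ratio` delays the peak and shortens the decay phase.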


Qwen2.5-VL-3B-SFT-wo_single_turn-7k-train-1epoch-0.1warm

num_train_epochs: 1.0
warmup_ratio: 0.1
equivalent_batchsize: 16
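
The README does not define `equivalent_batchsize`. A common reading, assumed here, is the number of examples contributing to one optimizer update: per-device batch size times gradient accumulation steps times number of devices. The sketch below shows a few hypothetical ways to reach 16; the actual split used for training is not stated.

```python
def equivalent_batch_size(per_device_batch, grad_accum_steps, num_devices):
    """Examples per optimizer update under the assumed definition."""
    return per_device_batch * grad_accum_steps * num_devices


# Hypothetical hardware splits that all give an equivalent batch size of 16.
# Each tuple is (per_device_batch, grad_accum_steps, num_devices).
splits = [(1, 16, 1), (2, 2, 4), (1, 2, 8)]
for per_device, accum, devices in splits:
    print(per_device, accum, devices,
          equivalent_batch_size(per_device, accum, devices))
```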

Qwen2.5-VL-3B-SFT-wo_single_turn-7k-train-1epoch-0.2warm

num_train_epochs: 1
warmup_ratio: 0.2
equivalent_batchsize: 16

Qwen2.5-VL-3B-SFT-wo_single_turn-7k-train-1epoch-8bs-0.1warm

num_train_epochs: 1
warmup_ratio: 0.1
equivalent_batchsize: 8

Qwen2.5-VL-3B-SFT-wo_single_turn-7k-train-5epoch-0.1warm

num_train_epochs: 5
warmup_ratio: 0.1
equivalent_batchsize: 16
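
To compare the four variants above in terms of optimizer steps, the sketch below derives total steps and warmup steps from the listed hyperparameters. It assumes the "7k" in the model name means roughly 7,000 training examples, and that both the steps per epoch and the warmup steps are rounded up; neither is confirmed by this README, so the numbers are only indicative.

```python
import math

# ASSUMPTION: "7k" in the model name is read as about 7,000 examples.
NUM_EXAMPLES = 7000

# Hyperparameters copied from the variant sections of this README.
runs = {
    "1epoch-0.1warm": {"epochs": 1, "warmup_ratio": 0.1, "batch": 16},
    "1epoch-0.2warm": {"epochs": 1, "warmup_ratio": 0.2, "batch": 16},
    "1epoch-8bs-0.1warm": {"epochs": 1, "warmup_ratio": 0.1, "batch": 8},
    "5epoch-0.1warm": {"epochs": 5, "warmup_ratio": 0.1, "batch": 16},
}


def plan(epochs, warmup_ratio, batch, num_examples=NUM_EXAMPLES):
    """Return (total optimizer steps, warmup steps) for one run."""
    steps_per_epoch = math.ceil(num_examples / batch)
    total_steps = steps_per_epoch * epochs
    # ASSUMPTION: warmup steps are rounded up from ratio * total steps.
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return total_steps, warmup_steps


for name, cfg in runs.items():
    total, warmup = plan(**cfg)
    print(f"{name}: total_steps={total}, warmup_steps={warmup}")
```

Under these assumptions, halving the batch size doubles the number of updates per epoch, and the 5-epoch run takes five times as many updates as the 1-epoch run at the same batch size.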