---
base_model: unsloth/Qwen2-VL-2B-Instruct-unsloth-bnb-4bit
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - qwen2_vl
  - trl
  - VisionQA
license: apache-2.0
language:
  - en
datasets:
  - hamzamooraj99/AgriPath-LF16-30k
---

# AgriPath-Qwen2-VL-2B-LoRA16

A fine-tune of Qwen2-VL-2B on AgriPath-LF16-30k for crop and disease classification. Uses LoRA (rank 16) to adapt both the vision and language layers.


## Model Details

- **Base Model:** Qwen2-VL-2B
- **Fine-tuned on:** AgriPath-LF16-30k
- **Fine-tuning Method:** LoRA (rank=16, alpha=16, dropout=0)
- **Layers Updated:** vision, attention, language, and MLP modules
- **Optimiser:** AdamW (8-bit)
- **Batch Size:** 2 per device (gradient accumulation = 4)
- **Learning Rate:** 2e-4
- **Training Time:** 189.27 minutes (~3.15 hours)
- **Peak GPU Memory:** 12.7 GB (RTX 4080 Super)
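To make the LoRA settings above concrete, here is a minimal, self-contained sketch (pure Python, hypothetical dimensions, not the actual training code): a frozen weight matrix `W` is adapted as `W + (alpha / r) * B @ A`, where only the low-rank factors `A` and `B` are trained. With rank = alpha = 16 as in this card, the scaling factor `alpha / r` is 1.0.

```python
# Minimal LoRA sketch (illustrative only; dimensions are hypothetical,
# not those of Qwen2-VL-2B). The frozen weight W is adapted as
# W_eff = W + (alpha / r) * B @ A, and only A and B are trained.

def matmul(X, Y):
    """Naive matrix multiply for nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_effective_weight(W, A, B, r, alpha):
    """Return W + (alpha / r) * B @ A."""
    delta = matmul(B, A)
    s = alpha / r
    return [[w + s * d for w, d in zip(rw, rd)] for rw, rd in zip(W, delta)]

# Identity-at-init property: B starts at zero, so the adapted layer
# initially behaves exactly like the frozen base layer.
d_in, d_out, r, alpha = 4, 4, 2, 2
W = [[1.0 if i == j else 0.0 for j in range(d_in)] for i in range(d_out)]
A = [[0.1 * (i + j) for j in range(d_in)] for i in range(r)]
B0 = [[0.0] * r for _ in range(d_out)]  # standard LoRA init: B = 0
assert lora_effective_weight(W, A, B0, r, alpha) == W

# Why LoRA is cheap: for a hypothetical d x d projection with d = 1024,
# rank-16 adapters train r * 2d parameters instead of d * d.
d, r_card = 1024, 16
full_params = d * d                 # 1,048,576
lora_params = r_card * (d + d)      # 32,768 (~3.1% of full)
print(full_params, lora_params)
```

The `d = 1024` and 4x4 matrices are placeholders for illustration; the real adapters here span the vision, attention, language, and MLP modules listed above.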

## Dataset

**[AgriPath-LF16-30k](https://huggingface.co/datasets/hamzamooraj99/AgriPath-LF16-30k)**

- 30,000 images across 16 crops and 65 (crop, disease) pairs
- 50% lab images, 50% field images
- Preprocessing:
  - Images resized to between min_pixels = 224x224 and max_pixels = 512x512
  - No additional augmentation
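The resizing step keeps each image's pixel count within a fixed budget while preserving aspect ratio. A rough sketch of that clamping (my own approximation, not the exact Qwen2-VL processor logic, which also snaps dimensions to patch multiples):

```python
import math

def clamp_resize(width, height, min_pixels=224 * 224, max_pixels=512 * 512):
    """Scale (width, height) so the total pixel count lands inside
    [min_pixels, max_pixels], preserving aspect ratio.
    Illustrative approximation of min_pixels/max_pixels clamping."""
    pixels = width * height
    if pixels > max_pixels:
        scale = math.sqrt(max_pixels / pixels)   # shrink large field photos
    elif pixels < min_pixels:
        scale = math.sqrt(min_pixels / pixels)   # upscale tiny crops
    else:
        return width, height                     # already within budget
    return max(1, round(width * scale)), max(1, round(height * scale))

print(clamp_resize(4000, 3000))  # large field photo shrunk to ~262k pixels
print(clamp_resize(100, 100))    # tiny image upscaled to the minimum budget
```

A 400x400 image (160,000 pixels) already sits inside the budget and passes through unchanged.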

## Training Performance

| Step | Training Loss | Validation Loss |
|------|---------------|-----------------|
| 500  | 0.006000      | 0.017821        |
| 1000 | 0.007700      | 0.012377        |
| 1500 | 0.013800      | 0.009712        |
| 2000 | 0.003200      | 0.008841        |
| 2500 | 0.013700      | 0.005980        |
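Picking the best checkpoint from a log like this is an argmin over validation loss, and the table also shows why validation loss is the right signal: it falls at every eval step even though training loss is noisy. A small sketch (values copied from the table above):

```python
# (step, training_loss, validation_loss) rows from the table above
log = [
    (500,  0.006000, 0.017821),
    (1000, 0.007700, 0.012377),
    (1500, 0.013800, 0.009712),
    (2000, 0.003200, 0.008841),
    (2500, 0.013700, 0.005980),
]

# Best checkpoint = row with the lowest validation loss
best_step, _, best_val = min(log, key=lambda row: row[2])
print(best_step, best_val)  # step 2500 has the lowest validation loss

# Validation loss is strictly decreasing across eval steps,
# while training loss fluctuates batch to batch.
val_losses = [row[2] for row in log]
assert all(a > b for a, b in zip(val_losses, val_losses[1:]))
```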

- Best validation loss: **0.005980** at step 2500
- Validation loss decreased at every evaluation step, indicating stable training with little overfitting


## Uploaded model

- **Developed by:** hamzamooraj99
- **License:** apache-2.0
- **Fine-tuned from model:** unsloth/Qwen2-VL-2B-Instruct-unsloth-bnb-4bit

This qwen2_vl model was trained 2x faster with Unsloth and Hugging Face's TRL library.