---
base_model: unsloth/Qwen2-VL-2B-Instruct-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2_vl
- trl
- VisionQA
license: apache-2.0
language:
- en
datasets:
- hamzamooraj99/AgriPath-LF16-30k
---
# AgriPath-Qwen2-VL-2B-LoRA16
Qwen2-VL-2B fine-tuned on AgriPath-LF16-30k for crop and disease classification. Uses LoRA (rank=16) to adapt the vision and language layers.
## Model Details
- Base Model: Qwen2-VL-2B
- Fine-tuned on: AgriPath-LF16-30k
- Fine-tuning Method: LoRA (Rank=16, Alpha=16, Dropout=0)
- Layers Updated: Vision, Attention, Language, MLP Modules
- Optimiser: AdamW (8-bit)
- Batch Size: 2 per device (Gradient Accumulation = 4)
- Learning Rate: 2e-4
- Training Time: 189.27 minutes (~3.15 hours)
- Peak GPU Usage: 12.7GB (RTX 4080 Super)
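The setup above can be sketched with Unsloth's `FastVisionModel` API. This is a configuration sketch, not the exact training script; the argument names follow Unsloth's documented interface:

```python
from unsloth import FastVisionModel

# Load the 4-bit quantised base model listed in the metadata above
model, tokenizer = FastVisionModel.from_pretrained(
    "unsloth/Qwen2-VL-2B-Instruct-unsloth-bnb-4bit",
    load_in_4bit=True,
)

# Attach LoRA adapters to the vision, attention, language and MLP modules
model = FastVisionModel.get_peft_model(
    model,
    finetune_vision_layers=True,
    finetune_language_layers=True,
    finetune_attention_modules=True,
    finetune_mlp_modules=True,
    r=16,            # LoRA rank
    lora_alpha=16,
    lora_dropout=0,
)
```

Training then proceeds with TRL's `SFTTrainer` using the batch size, gradient accumulation, and learning rate listed above.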
## Dataset

**AgriPath-LF16-30k**
- 30,000 images across 16 crops and 65 (crop, disease) pairs
- 50% lab images, 50% field images
- Preprocessing:
  - Images resized to fit `max_pixels = 512x512`, `min_pixels = 224x224`
  - No additional augmentation
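The resizing bounds above follow the Qwen2-VL processor's behaviour: each image's total pixel count is kept between `min_pixels` and `max_pixels` while preserving aspect ratio and snapping dimensions to the model's 28-pixel patch grid. A minimal sketch of that logic (an illustration, not the processor's exact code):

```python
import math

def smart_resize(height, width, factor=28,
                 min_pixels=224 * 224, max_pixels=512 * 512):
    """Rescale (height, width) so the total pixel count lies in
    [min_pixels, max_pixels], rounding each dimension to a multiple
    of `factor` (Qwen2-VL's vision patch size)."""
    h = round(height / factor) * factor
    w = round(width / factor) * factor
    if h * w > max_pixels:
        # Shrink to fit under the pixel budget
        scale = math.sqrt(height * width / max_pixels)
        h = math.floor(height / scale / factor) * factor
        w = math.floor(width / scale / factor) * factor
    elif h * w < min_pixels:
        # Grow to reach the minimum pixel count
        scale = math.sqrt(min_pixels / (height * width))
        h = math.ceil(height * scale / factor) * factor
        w = math.ceil(width * scale / factor) * factor
    return h, w
```

For example, a 1024x1024 field photo is scaled down to stay within the 512x512 pixel budget, while a small 100x100 thumbnail is scaled up past the 224x224 minimum.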
## Training Performance

| Step | Training Loss | Validation Loss |
|---|---|---|
| 500 | 0.006000 | 0.017821 |
| 1000 | 0.007700 | 0.012377 |
| 1500 | 0.013800 | 0.009712 |
| 2000 | 0.003200 | 0.008841 |
| 2500 | 0.013700 | 0.005980 |
✅ Best validation loss: 0.005980 at step 2500
✅ Validation loss decreased steadily throughout training, indicating stable optimisation with little overfitting
## Uploaded model
- Developed by: hamzamooraj99
- License: apache-2.0
- Finetuned from model: unsloth/Qwen2-VL-2B-Instruct-unsloth-bnb-4bit
This qwen2_vl model was trained 2x faster with Unsloth and Hugging Face's TRL library.
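For inference, a minimal sketch using `transformers` and the `qwen-vl-utils` helper package (the repo id, image path, and prompt below are illustrative; a GPU is assumed):

```python
from transformers import Qwen2VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info

model = Qwen2VLForConditionalGeneration.from_pretrained(
    "hamzamooraj99/AgriPath-Qwen2-VL-2B-LoRA16",
    torch_dtype="auto", device_map="auto",
)
processor = AutoProcessor.from_pretrained(
    "hamzamooraj99/AgriPath-Qwen2-VL-2B-LoRA16")

# Qwen2-VL chat format: content mixes image and text parts
messages = [{"role": "user", "content": [
    {"type": "image", "image": "leaf.jpg"},
    {"type": "text", "text": "Identify the crop and disease in this image."},
]}]

text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(text=[text], images=image_inputs,
                   padding=True, return_tensors="pt").to(model.device)

out = model.generate(**inputs, max_new_tokens=64)
# Decode only the newly generated tokens
answer = processor.batch_decode(
    out[:, inputs.input_ids.shape[1]:], skip_special_tokens=True)[0]
print(answer)
```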