---
base_model: unsloth/Qwen2-VL-2B-Instruct-unsloth-bnb-4bit
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - qwen2_vl
  - trl
  - VisionQA
license: apache-2.0
language:
  - en
datasets:
  - hamzamooraj99/AgriPath-LF16-30k
---

# AgriPath-Qwen2-VL-2B-LoRA16

A fine-tune of Qwen2-VL-2B on AgriPath-LF16-30k for crop and disease classification. Uses LoRA (rank 16) to adapt both the vision and language layers.


## Model Details

- **Base Model:** Qwen2-VL-2B
- **Fine-tuned on:** AgriPath-LF16-30k
- **Fine-tuning Method:** LoRA (rank=16, alpha=16, dropout=0)
- **Layers Updated:** vision, attention, language, and MLP modules
- **Optimiser:** AdamW (8-bit)
- **Batch Size:** 2 per device (gradient accumulation = 4)
- **Learning Rate:** 2e-4
- **Training Time:** 189.27 minutes (~3.15 hours)
- **Peak GPU Memory:** 12.7 GB (RTX 4080 Super)
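To make the LoRA settings above concrete, here is a minimal, self-contained sketch (pure Python, hypothetical dimensions, not the actual training code): a frozen weight matrix `W` is adapted as `W + (alpha / r) * B @ A`, where only the low-rank factors `A` and `B` are trained. With rank = alpha = 16 as in this card, the scaling factor `alpha / r` is 1.0.

```python
# Minimal LoRA sketch (illustrative only; dimensions are hypothetical,
# not those of Qwen2-VL-2B). The frozen weight W is adapted as
# W_eff = W + (alpha / r) * B @ A, and only A and B are trained.

def matmul(X, Y):
    """Naive matrix multiply for nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_effective_weight(W, A, B, r, alpha):
    """Return W + (alpha / r) * B @ A."""
    delta = matmul(B, A)
    s = alpha / r
    return [[w + s * d for w, d in zip(rw, rd)] for rw, rd in zip(W, delta)]

# Identity-at-init property: B starts at zero, so the adapted layer
# initially behaves exactly like the frozen base layer.
d_in, d_out, r, alpha = 4, 4, 2, 2
W = [[1.0 if i == j else 0.0 for j in range(d_in)] for i in range(d_out)]
A = [[0.1 * (i + j) for j in range(d_in)] for i in range(r)]
B0 = [[0.0] * r for _ in range(d_out)]  # standard LoRA init: B = 0
assert lora_effective_weight(W, A, B0, r, alpha) == W

# Why LoRA is cheap: for a hypothetical d x d projection with d = 1024,
# rank-16 adapters train r * 2d parameters instead of d * d.
d, r_card = 1024, 16
full_params = d * d                 # 1,048,576
lora_params = r_card * (d + d)      # 32,768 (~3.1% of full)
print(full_params, lora_params)
```

The `d = 1024` and 4x4 matrices are placeholders for illustration; the real adapters here span the vision, attention, language, and MLP modules listed above.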

## Dataset

**[AgriPath-LF16-30k](https://huggingface.co/datasets/hamzamooraj99/AgriPath-LF16-30k)**

- 30,000 images across 16 crops and 65 (crop, disease) pairs
- 50% lab images, 50% field images
- Preprocessing:
  - Images resized to between min_pixels = 224x224 and max_pixels = 512x512
  - No additional augmentation
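The resizing step keeps each image's pixel count within a fixed budget while preserving aspect ratio. A rough sketch of that clamping (my own approximation, not the exact Qwen2-VL processor logic, which also snaps dimensions to patch multiples):

```python
import math

def clamp_resize(width, height, min_pixels=224 * 224, max_pixels=512 * 512):
    """Scale (width, height) so the total pixel count lands inside
    [min_pixels, max_pixels], preserving aspect ratio.
    Illustrative approximation of min_pixels/max_pixels clamping."""
    pixels = width * height
    if pixels > max_pixels:
        scale = math.sqrt(max_pixels / pixels)   # shrink large field photos
    elif pixels < min_pixels:
        scale = math.sqrt(min_pixels / pixels)   # upscale tiny crops
    else:
        return width, height                     # already within budget
    return max(1, round(width * scale)), max(1, round(height * scale))

print(clamp_resize(4000, 3000))  # large field photo shrunk to ~262k pixels
print(clamp_resize(100, 100))    # tiny image upscaled to the minimum budget
```

A 400x400 image (160,000 pixels) already sits inside the budget and passes through unchanged.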

## Training Performance

| Step | Training Loss | Validation Loss |
|------|---------------|-----------------|
| 500  | 0.006000      | 0.017821        |
| 1000 | 0.007700      | 0.012377        |
| 1500 | 0.013800      | 0.009712        |
| 2000 | 0.003200      | 0.008841        |
| 2500 | 0.013700      | 0.005980        |
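Picking the best checkpoint from a log like this is an argmin over validation loss, and the table also shows why validation loss is the right signal: it falls at every eval step even though training loss is noisy. A small sketch (values copied from the table above):

```python
# (step, training_loss, validation_loss) rows from the table above
log = [
    (500,  0.006000, 0.017821),
    (1000, 0.007700, 0.012377),
    (1500, 0.013800, 0.009712),
    (2000, 0.003200, 0.008841),
    (2500, 0.013700, 0.005980),
]

# Best checkpoint = row with the lowest validation loss
best_step, _, best_val = min(log, key=lambda row: row[2])
print(best_step, best_val)  # step 2500 has the lowest validation loss

# Validation loss is strictly decreasing across eval steps,
# while training loss fluctuates batch to batch.
val_losses = [row[2] for row in log]
assert all(a > b for a, b in zip(val_losses, val_losses[1:]))
```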

- Best validation loss: **0.005980** at step 2500
- Validation loss decreased at every evaluation step, indicating stable training with little overfitting


## Uploaded model

- **Developed by:** hamzamooraj99
- **License:** apache-2.0
- **Fine-tuned from model:** unsloth/Qwen2-VL-2B-Instruct-unsloth-bnb-4bit

This qwen2_vl model was trained 2x faster with Unsloth and Hugging Face's TRL library.