Gemma 3 Object Detection
This model is a fine-tuned version of Gemma 3 4B for license plate object detection.
This model aims to show that VLMs not previously trained for object detection, and with no prior knowledge of location tokens (<locXXXX>), can still be fine-tuned for object detection out of the box. This is an experimental model.
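To make the idea of location tokens concrete, here is a minimal sketch of how a bounding box could be written as four <locXXXX> tokens and read back. It assumes the PaliGemma-style convention (1024 bins per axis, token order y_min, x_min, y_max, x_max); the helper names box_to_loc_tokens and loc_tokens_to_box are hypothetical, and the repository's own encoding may differ.

```python
import re

# Assumption: PaliGemma-style vocabulary <loc0000>..<loc1023>, one bin per
# 1/1024 of the image side. Not confirmed by the model card itself.
NUM_BINS = 1024
LOC_PATTERN = re.compile(r"<loc(\d{4})>")


def box_to_loc_tokens(box, image_width, image_height):
    """Encode a pixel box (x_min, y_min, x_max, y_max) as four location tokens.

    Assumed token order (borrowed from PaliGemma): y_min, x_min, y_max, x_max.
    """
    x_min, y_min, x_max, y_max = box
    normalized = (
        y_min / image_height,
        x_min / image_width,
        y_max / image_height,
        x_max / image_width,
    )
    # Scale to bin indices and clamp into the valid range [0, NUM_BINS - 1].
    bins = [min(max(int(v * NUM_BINS), 0), NUM_BINS - 1) for v in normalized]
    return "".join(f"<loc{b:04d}>" for b in bins)


def loc_tokens_to_box(text, image_width, image_height):
    """Decode the first four location tokens in text into a pixel box."""
    bins = [int(m) for m in LOC_PATTERN.findall(text)[:4]]
    if len(bins) != 4:
        raise ValueError("expected four location tokens")
    y_min, x_min, y_max, x_max = (b / NUM_BINS for b in bins)
    return (
        x_min * image_width,
        y_min * image_height,
        x_max * image_width,
        y_max * image_height,
    )


tokens = box_to_loc_tokens((100, 50, 300, 150), image_width=640, image_height=480)
print(tokens)  # <loc0106><loc0160><loc0320><loc0480>
print(loc_tokens_to_box(tokens, 640, 480))  # (100.0, 49.6875, 300.0, 150.0)
```

The round trip loses a little precision (y_min comes back as 49.6875 rather than 50) because coordinates are quantized to 1024 bins. A base model that has never seen these tokens has to learn both the vocabulary and this quantized coordinate scheme during fine-tuning.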
Follow these steps to configure, train, and run predictions using the code repository:
1. config.py: All major parameters are centralized here. Before running any script, review and adjust these settings as needed.
2. train.py: This script handles the fine-tuning process.
3. infer.py: Run this to visualize object detection.

If you use our work, please cite us:
@misc{gosthipaty_gemma3_object_detection_2025,
author = {Aritra Roy Gosthipaty and Sergio Paniego},
title = {Fine-tuning Gemma 3 for Object Detection},
year = {2025},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/ariG23498/gemma3-object-detection.git}}
}
Base model
google/gemma-3-4b-pt