Update README.md
README.md (CHANGED)
@@ -29,6 +29,8 @@ datasets:
> The **DeepCaption-VLA-7B** model is a fine-tuned version of **Qwen2.5-VL-7B-Instruct**, tailored for **Image Captioning** and **Vision Language Attribution**. This variant is designed to generate precise, highly descriptive captions with a focus on **defining visual properties, object attributes, and scene details** across a wide spectrum of images and aspect ratios.

+ [DeepCaption-VLA-7B notebook demo (4-bit)](https://github.com/PRITHIVSAKTHIUR/Multimodal-Outpost-Notebooks/blob/main/DeepCaption-VLA-7B%5B4bit%20-%20notebook%20demo%5D/DeepCaption-VLA-7B.ipynb)
+

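For reference, here is a minimal usage sketch loading the model with Hugging Face `transformers`, following the standard Qwen2.5-VL recipe. The Hub repo id, image path, and prompt below are illustrative assumptions, not taken from this diff; the linked notebook is the authoritative demo.

```python
# Minimal sketch (assumed usage), following the standard Qwen2.5-VL
# loading recipe in Hugging Face transformers. The repo id, image path,
# and prompt are illustrative assumptions.
import torch
from PIL import Image
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

model_id = "prithivMLmods/DeepCaption-VLA-7B"  # assumed Hub repo id
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("example.jpg")  # any local image
messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe the image, attributing the "
                                 "visual properties of each object and "
                                 "the overall scene."},
    ],
}]

# Build the chat prompt, run generation, and strip the prompt tokens.
text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = processor(text=[text], images=[image], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256)
trimmed = output_ids[0][inputs["input_ids"].shape[-1]:]
print(processor.decode(trimmed, skip_special_tokens=True))
```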
# Key Highlights
1. **Vision Language Attribution (VLA):** Specially fine-tuned to attribute and define visual properties of objects, scenes, and environments.