nielsr (HF Staff) committed · verified
Commit 598277c · 1 Parent(s): 8499fb7

Update pipeline tag, add library_name, and links to paper/code

This PR updates the model card to use the correct `pipeline_tag` (`image-text-to-text`), since the model takes both image and text inputs and generates a text response. It also adds `library_name: transformers`, since the model is compatible with the Hugging Face Transformers library.

Additionally, explicit links to the paper and the GitHub repository are added for improved discoverability.
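With `pipeline_tag: image-text-to-text` and `library_name: transformers` declared, the model should be loadable through the Transformers `image-text-to-text` pipeline. Below is a minimal sketch, assuming a recent Transformers release; the Hub repo id (`WikiChao/DRIFT`) and the image URL are illustrative placeholders, not identifiers confirmed by this PR.

```python
from transformers import pipeline

# Minimal sketch: load the model via the `image-text-to-text` pipeline.
# NOTE: the repo id and image URL below are placeholders for illustration;
# substitute the actual model id from this model card's repository.
pipe = pipeline("image-text-to-text", model="WikiChao/DRIFT", device_map="auto")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/sample.jpg"},
            {"type": "text", "text": "Describe the image and explain your reasoning."},
        ],
    }
]

# The pipeline applies the chat template, fetches the image, and generates text.
outputs = pipe(text=messages, max_new_tokens=128, return_full_text=False)
print(outputs[0]["generated_text"])
```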

Files changed (1): README.md (+6, -3)
README.md CHANGED

@@ -1,18 +1,21 @@
 ---
-license: mit
 base_model: qwen2.5-vl
+license: mit
+pipeline_tag: image-text-to-text
 tags:
 - vision-language-model
 - multimodal
 - reasoning
 - fine-tuned
 - qwen
-pipeline_tag: image-to-text
+library_name: transformers
 ---

 # DRIFT

 This is a fine-tuned version of Qwen2.5-VL for enhanced reasoning capabilities, specifically optimized for multimodal reasoning tasks.
+The model is presented in the paper [Directional Reasoning Injection for Fine-Tuning MLLMs](https://huggingface.co/papers/2510.15050).
+The code and further details can be found on the GitHub repository: https://github.com/WikiChao/DRIFT

 ## Usage

@@ -78,4 +81,4 @@ If you use this model, please cite our paper.

 ## License

-This model is released under the MIT license.
+This model is released under the MIT license.