ChaoHuangCS committed on
Commit 8ee4b34 · verified · 1 Parent(s): 5f1c818

Update README.md

Files changed (1):
  1. README.md +4 -32
README.md CHANGED

@@ -14,45 +14,16 @@ pipeline_tag: image-to-text
 
 This is a fine-tuned version of Qwen2.5-VL for enhanced reasoning capabilities, specifically optimized for multimodal reasoning tasks.
 
-## Model Details
-
-- **Base Model**: qwen2.5-vl
-- **Model Type**: Vision-Language Model
-- **Task**: Multimodal reasoning and visual question answering
-- **Fine-tuning**: Custom training on reasoning datasets
-
-## Model Files
-
-This repository contains only the essential files for inference:
-
-### Core Model Files
-- `config.json`: Model configuration
-- `generation_config.json`: Text generation configuration
-- `model-*.safetensors`: Model weights in SafeTensors format
-- `model.safetensors.index.json`: Model weights index
-
-### Tokenizer Files
-- `tokenizer.json`: Tokenizer configuration
-- `tokenizer_config.json`: Tokenizer settings
-- `vocab.json`: Vocabulary file
-- `merges.txt`: BPE merge rules
-- `added_tokens.json`: Additional tokens
-- `special_tokens_map.json`: Special token mappings
-
-### Vision Processing
-- `preprocessor_config.json`: Image preprocessing configuration
-- `chat_template.json`: Chat template for conversations
-
 ## Usage
 
 ```python
-from transformers import AutoModelForCausalLM, AutoProcessor
+from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor
 import torch
 
 model_id = "ChaoHuangCS/DRIFT-VL-7B"
 
 # Load model and processor
-model = AutoModelForCausalLM.from_pretrained(
+model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
     model_id,
     torch_dtype=torch.float16,
     device_map="auto",
@@ -88,6 +59,7 @@ print(response)
 
 This model was fine-tuned using:
 - **Base Model**: Qwen2.5-VL
+- **Merged Model**: DeepSeek-R1
 - **Training Method**: Custom reasoning-focused fine-tuning
 - **Dataset**: Multimodal reasoning datasets
 - **Architecture**: Preserves original Qwen2.5-VL architecture
@@ -102,7 +74,7 @@ The model has been optimized for:
 
 ## Citation
 
-If you use this model, please cite the original Qwen2.5-VL paper and mention this fine-tuned version.
+If you use this model, please cite our paper.
 
 ## License
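The import change in the diff above is the substance of this commit: Qwen2.5-VL models load through the `Qwen2_5_VLForConditionalGeneration` class rather than `AutoModelForCausalLM`. A minimal sketch of how the corrected snippet is typically driven follows; the image path, prompt text, and chat-message layout here are illustrative assumptions, not part of the commit, and the heavyweight model/processor calls are commented out because they download the full 7B checkpoint.

```python
# Sketch of the post-commit usage path (illustrative, not from the commit).
model_id = "ChaoHuangCS/DRIFT-VL-7B"

# Qwen2.5-VL chat format: each user turn is a list of typed content parts,
# typically the image part(s) followed by the text prompt.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "example.jpg"},  # illustrative path
            {"type": "text", "text": "Describe the reasoning shown in this image."},
        ],
    }
]

# The actual loading calls, matching the fixed import in the diff:
#
# from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor
# import torch
#
# model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
#     model_id, torch_dtype=torch.float16, device_map="auto"
# )
# processor = AutoProcessor.from_pretrained(model_id)
# text = processor.apply_chat_template(messages, add_generation_prompt=True)
```

The message structure is what `chat_template.json` in the repository consumes; passing a bare string instead of this typed-part list is a common source of errors with vision-language checkpoints.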