aarbelle committed · Commit fdb5d25 · verified · 1 Parent(s): fb78cf4

Update README.md


Add disclaimer about model size

Files changed (1): README.md +6 -0
README.md CHANGED
@@ -158,6 +158,12 @@ The architecture of granite-vision-3.1-2b-preview consists of the following comp
 
 We built upon LlaVA (https://llava-vl.github.io) to train our model. We use multi-layer encoder features and a denser grid resolution in AnyRes to enhance the model's ability to understand nuanced visual content, which is essential for accurately interpreting document images.
 
+_Note:_
+
+We denote our model as Granite-Vision-3.1-2B-Preview, where the version (3.1) and size (2B) of the base large language model
+are explicitly indicated. However, when considering the integrated vision encoder and projector, the total parameter count of our
+model increases to 3 billion parameters.
+
 
 **Training Data:**
 
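For readers who want to verify the parameter breakdown described in the added note, here is a minimal sketch (not part of the commit) that loads the checkpoint with Hugging Face transformers and counts parameters per component. The repo id `ibm-granite/granite-vision-3.1-2b-preview` and the submodule names (`language_model`, `vision_tower`, `multi_modal_projector`, following the transformers LLaVA-NeXT implementation) are assumptions, not taken from this diff.

```python
# Sketch: count parameters per component of a LLaVA-style vision-language model.
# Assumes the checkpoint is hosted under the repo id below (not confirmed by this diff).
from transformers import AutoModelForVision2Seq

model = AutoModelForVision2Seq.from_pretrained(
    "ibm-granite/granite-vision-3.1-2b-preview"  # assumed Hugging Face repo id
)

def count_params(module) -> int:
    """Total number of parameters in a module."""
    return sum(p.numel() for p in module.parameters())

# Submodule names follow the transformers LLaVA-NeXT implementation and are
# assumptions here; getattr(..., None) skips any that differ in this checkpoint.
for name in ("language_model", "vision_tower", "multi_modal_projector"):
    sub = getattr(model, name, None)
    if sub is not None:
        print(f"{name}: {count_params(sub) / 1e9:.2f}B parameters")

# If the note is accurate, the pieces should sum to roughly 3B:
# a ~2B language model plus the vision encoder and projector.
print(f"total: {count_params(model) / 1e9:.2f}B parameters")
```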