Upload HfMoondream
Browse files

- README.md +1 -1
- model.safetensors +2 -2
- text.py +1 -1
README.md
CHANGED
@@ -7,7 +7,7 @@ Moondream is a small vision language model designed to run efficiently everywhere.
 
 [Website](https://moondream.ai/) / [Demo](https://moondream.ai/playground) / [GitHub](https://github.com/vikhyat/moondream)
 
-This repository contains the 2025-04-14 **4bit** release of Moondream. On an Nvidia RTX 3090, it uses 2,305 MB of VRAM and runs at a speed of 187 tokens/second. We used quantization-aware training techniques to build this version of the model, allowing us to achieve a 45% reduction in memory usage with only an
+This repository contains the 2025-04-14 **4bit** release of Moondream. On an Nvidia RTX 3090, it uses 2,305 MB of VRAM and runs at a speed of 187 tokens/second. We used quantization-aware training techniques to build this version of the model, allowing us to achieve a 45% reduction in memory usage with only a 2% drop in accuracy.
 
 There's more information about this version of the model in our [release blog post](https://moondream.ai/blog/smaller-faster-moondream-with-qat). Other revisions, as well as release history, can be found [here](https://huggingface.co/vikhyatk/moondream2).
 
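As a usage note for this release (not part of the commit itself), here is a minimal sketch of loading the checkpoint with transformers. The repository id, revision string, and the caption/query methods are assumptions based on the moondream2 remote-code API linked in the README; check this repository's model card for the actual values.

```python
# Sketch only: repo id, revision, and method names are assumptions taken from the
# moondream2 model card linked above; adjust them to this repository's card.
from transformers import AutoModelForCausalLM
from PIL import Image

model = AutoModelForCausalLM.from_pretrained(
    "vikhyatk/moondream2",     # assumed repo id; this 4bit build may live under a different name
    revision="2025-04-14",     # assumed revision tag for this release
    trust_remote_code=True,    # loads the HfMoondream wrapper uploaded in this commit
    device_map={"": "cuda"},
)

image = Image.open("example.jpg")  # any local image
print(model.caption(image, length="short")["caption"])
print(model.query(image, "What is in this image?")["answer"])
```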
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:dfce186edf359fff98d0c077ae389b980b6cae99279d157fc00b2d03ca65968f
+size 2032380848
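Because model.safetensors is stored via Git LFS, the change above touches only the pointer file: a new object id and size. As a hedged sketch (the local file path is an assumption), the downloaded weights can be checked against those two values:

```python
import hashlib

# Values copied from the LFS pointer in this commit.
EXPECTED_OID = "dfce186edf359fff98d0c077ae389b980b6cae99279d157fc00b2d03ca65968f"
EXPECTED_SIZE = 2032380848  # bytes

def verify(path: str) -> bool:
    """Hash the resolved file in chunks and compare against the pointer's oid and size."""
    h = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
            size += len(chunk)
    return h.hexdigest() == EXPECTED_OID and size == EXPECTED_SIZE

# Assumed path to the downloaded weights; adjust as needed.
print(verify("model.safetensors"))
```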
text.py
CHANGED
@@ -187,7 +187,7 @@ def build_text_model(config: TextConfig, dtype: torch.dtype) -> nn.Module:
                 ]
             ),
             "post_ln": nn.LayerNorm(config.dim, dtype=dtype),
-            "lm_head":
+            "lm_head": nn.Linear(config.dim, config.vocab_size, dtype=dtype),
         }
     )
     text.wte = nn.Parameter(torch.empty(config.vocab_size, config.dim, dtype=dtype))
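For context, the completed line registers the text model's output projection: an nn.Linear mapping hidden states of width config.dim to config.vocab_size logits. Below is a minimal sketch of how such a head is typically applied after the final layer norm; the dimensions are illustrative stand-ins, not the repo's actual TextConfig values.

```python
import torch
import torch.nn as nn

# Illustrative toy sizes only; the real values come from TextConfig in this repo.
dim, vocab_size, seq_len = 64, 1000, 8

post_ln = nn.LayerNorm(dim)
lm_head = nn.Linear(dim, vocab_size)

hidden = torch.randn(1, seq_len, dim)      # stand-in for the last transformer block's output
logits = lm_head(post_ln(hidden))          # (1, seq_len, vocab_size)
next_token = logits[:, -1].argmax(dim=-1)  # greedy pick for the final position
print(logits.shape, next_token.shape)
```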