Update README.md
README.md
CHANGED

@@ -24,7 +24,7 @@ Gemma v2 is a large language model released by Google on Jun 27th 2024.
 - Original model: [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it)
 
 The model is packaged into executable weights, which we call
-[llamafiles](https://github.com/Mozilla-Ocho/llamafile)
+[llamafiles](https://github.com/Mozilla-Ocho/llamafile). This makes it
 easy to use the model on Linux, MacOS, Windows, FreeBSD, OpenBSD, and
 NetBSD for AMD64 and ARM64.
 
@@ -75,11 +75,9 @@ of the README.
 
 When using the browser GUI, you need to fill out the following fields.
 
-Prompt template:
+Prompt template (note: this is for chat; Gemma doesn't have a system role):
 
 ```
-<start_of_turn>system
-{{prompt}}<end_of_turn>
 {{history}}
 <start_of_turn>{{char}}
 ```
@@ -109,9 +107,14 @@ AMD64.
 
 ## About Quantization Formats
 
-This model works
-
-
+This model works well with any quantization format. Q6\_K is the best
+choice overall here. We tested, with [our 27b Gemma2
+llamafiles](https://huggingface.co/jartine/gemma-2-27b-it-llamafile),
+that the llamafile implementation of Gemma2 is able to produce
+identical responses to the Gemma2 model that's hosted by Google on
+aistudio.google.com. Therefore we'd assume these 9b llamafiles are also
+faithful to Google's intentions. If you encounter any divergences, then
+try using the BF16 weights, which have the original fidelity.
 
 ---
 
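For readers unfamiliar with the template fields changed above, here is a minimal sketch of how they expand into a Gemma 2 turn sequence. Since Gemma has no system role, the template carries only the accumulated chat history plus the tag for the next speaker. The `render_prompt` helper below is illustrative only, not part of the llamafile API; the `{{history}}` and `{{char}}` placeholders are the ones the browser GUI fills in.

```python
# Sketch of how the GUI's new prompt template expands for Gemma 2.
# Gemma 2 has no system role, so the template is just the chat history
# followed by the next speaker tag. This helper is hypothetical, for
# illustration; it is not a llamafile function.

def render_prompt(history: str, char: str) -> str:
    """Fill the template from the README: {{history}}, then <start_of_turn>{{char}}."""
    template = "{{history}}\n<start_of_turn>{{char}}\n"
    return template.replace("{{history}}", history).replace("{{char}}", char)

# One user turn already in the history; the model speaks next.
history = "<start_of_turn>user\nHello!<end_of_turn>"
print(render_prompt(history, "model"))
```

Note that the old template's `<start_of_turn>system` / `{{prompt}}<end_of_turn>` lines were removed precisely because Gemma's tokenizer defines no system turn; any system-style instructions belong in a user turn instead.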