Commit 27a9813 (1 parent: 8f1eae7): "Update README.md"

README.md (CHANGED)
@@ -20,16 +20,43 @@ datasets:
 - OpenAssistant/oasst1
 library_name: transformers
 ---
-
+[]()
+
 ## I am still building the structure of these descriptions.
-These will carry increasingly more content to help find the best models for a purpose.

-
+These will contain increasingly more content to help you find the best model for a given purpose.

-
+# falcon-7b-sft-top1-696 - GGUF
+- Model creator: [OpenAssistant](https://huggingface.co/OpenAssistant)
+- Original model: [falcon-7b-sft-top1-696](https://huggingface.co/OpenAssistant/falcon-7b-sft-top1-696)

-# Original Model Card:

+
+# About the GGUF format
+
+`gguf` is the current file format used by the [`ggml`](https://github.com/ggerganov/ggml) library.
+A growing list of software supports it and can therefore use this model.
+The core project built on the ggml library is the [llama.cpp](https://github.com/ggerganov/llama.cpp) project by Georgi Gerganov.
+
+# Quantization variants
+
+A number of quantized files are available. How to choose the best one for you:
+
+# Legacy quants
+
+Q4_0, Q4_1, Q5_0, Q5_1 and Q8 are `legacy` quantization types.
+Nevertheless, they are fully supported, as several circumstances can make certain models incompatible with the modern K-quants.
+Falcon 7B models cannot be quantized to K-quants.
+
+# K-quants
+
+K-quants are based on the idea that quantization affects different parts of the model in different ways. Quantizing some parts more and others less yields a more capable model at the same file size, or a smaller file and lower memory load at comparable quality.
+So, if possible, use K-quants.
+With a Q6_K you should find it really hard to notice any quality difference from the original model - ask your model the same question twice and you may see bigger differences between the two answers than between the quantized and the original model.
+
+
+
+# Original Model Card:
 # Open-Assistant Falcon 7B SFT OASST-TOP1 Model

 This model is a fine-tuning of TII's [Falcon 7B](https://huggingface.co/tiiuae/falcon-7b) LLM.
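The GGUF and quantization notes added above are enough to try the model locally. As a minimal sketch (not part of this commit), loading one of the quantized files with the llama-cpp-python binding for the llama.cpp project named above could look like this; the local file name and the generation settings are assumptions, and the `<|prompter|>`/`<|assistant|>` prompt format comes from the original OpenAssistant model card:

```python
# A minimal sketch, not part of this commit: run one of the quantized
# GGUF files locally via llama-cpp-python. File name and settings are
# assumptions, not taken from this repository.
from llama_cpp import Llama

llm = Llama(
    model_path="falcon-7b-sft-top1-696-Q4_0.gguf",  # hypothetical local file
    n_ctx=2048,    # context window size
    n_threads=8,   # CPU threads; tune for your machine
)

# The original OpenAssistant card specifies the <|prompter|>/<|assistant|>
# chat format, with each turn terminated by <|endoftext|>.
prompt = "<|prompter|>What does the GGUF file format do?<|endoftext|><|assistant|>"

result = llm(prompt, max_tokens=256, stop=["<|endoftext|>"])
print(result["choices"][0]["text"])
```

The legacy Q4_0 variant is chosen deliberately here: as the card notes, Falcon 7B models cannot be quantized to K-quants.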
@@ -142,4 +169,10 @@ deepspeed trainer_sft.py --configs defaults falcon-7b oasst-top1 --cache_dir <data_cache_dir>
 Export command:
 ```
 python export_model.py --dtype bf16 --hf_repo_name OpenAssistant/falcon-7b-sft-top1 --trust_remote_code --auth_token <auth_token> <output_path> --max_shard_size 2GB
-
+```<center>
+[](https://maddes8cht.github.io)
+[](https://stackexchange.com/users/26485911)
+[](https://github.com/maddes8cht)
+[](https://huggingface.co/maddes8cht)
+[](https://twitter.com/maddes1966)
+</center>
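For completeness, here is a hedged sketch of fetching one of the quantization variants programmatically with huggingface_hub before running it; both the repo_id and the filename are assumptions and should be checked against the repository's actual file list:

```python
# A minimal sketch, assuming the quantized files are published in a Hub
# repository: download one variant with huggingface_hub. Both repo_id and
# filename are hypothetical; check the repository's file list.
from huggingface_hub import hf_hub_download

local_path = hf_hub_download(
    repo_id="maddes8cht/falcon-7b-sft-top1-696-gguf",  # hypothetical repo id
    filename="falcon-7b-sft-top1-696-Q5_0.gguf",       # hypothetical file name
)
print(local_path)  # pass this path as model_path to a llama.cpp binding
```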