Text Generation
GGUF
English
creative
creative writing
fiction writing
plot generation
sub-plot generation
story generation
scene continue
storytelling
fiction story
science fiction
romance
all genres
story
writing
vivid prose
vivid writing
fiction
roleplaying
bfloat16
swearing
rp
llama3
llama-3
enhanced quants
max quants
maxcpu quants
horror
mergekit
conversational
Update README.md
README.md
CHANGED
@@ -36,9 +36,8 @@ pipeline_tag: text-generation
 
 <B>L3-Dark-Planet-8B-GGUF - Updates Dec 21 2024: (uploading quants ... refreshed, and new quants):</B>
 - All quants have been "refreshed", quantized with the latest LLAMACPP improvements: better instruction following and output generation across all quants.
-- All quants have also been upgraded with "more bits" for the output tensor and embed for better performance (this is in addition to the "refresh").
-
-- New "ARM" quants have been added for machines that can run them. (format: ".../Q4_0_4_4.gguf")
+- All quants have also been upgraded with "more bits" for the output tensor (all set at Q8_0) and embed for better performance (this is in addition to the "refresh").
+- New "ARM" quants have been added for machines that can run them, with the output tensor set at Q8_0. (format: ".../Q4_0_4_4.gguf")
 - New specialized quants (in addition to the refresh/upgrades): "max" and "max-cpu" (included in the file name) for quants "Q2K" (max-cpu only), "IQ4_XS", "Q6_K" and "Q8_0".
 - "MAX": output tensor / embed at float 16 (better instruction following/output generation than standard quants).
 - "MAX-CPU": output tensor / embed at bfloat 16, which forces these onto the CPU (behavior on Nvidia and other cards will vary). This frees up VRAM at a cost in tokens/second, and you still get better instruction following/output generation.
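The "MAX"-style recipe described above (standard quant for most tensors, output tensor and token embeddings kept at higher precision) can be sketched with llama.cpp's `llama-quantize` tool. This is an illustration, not the author's exact pipeline: the file names are placeholders, and the `--output-tensor-type` / `--token-embedding-type` flags require a reasonably recent llama.cpp build.

```shell
# Baseline Q6_K quant from a full-precision GGUF (placeholder file names):
./llama-quantize model-f16.gguf model-Q6_K.gguf Q6_K

# "MAX"-style Q6_K: hold the output tensor and token embeddings at F16
# while quantizing the rest to Q6_K, trading a little file size for
# better instruction following / output generation.
./llama-quantize \
  --output-tensor-type f16 \
  --token-embedding-type f16 \
  model-f16.gguf model-Q6_K-MAX.gguf Q6_K
```

For the "MAX-CPU" variant, the same idea applies with bfloat16 tensor types; since many GPU backends do not run bf16 tensors, those tensors stay on the CPU, freeing VRAM at some cost in tokens per second.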