bartowski/deepseek-ai_DeepSeek-V3.1-Base-Q4_K_M-GGUF

bartowski committed 4 days ago

Commit 48633a2 · verified · 1 Parent(s): 8db6a8e

Update README.md

Files changed (1)

README.md +11 -1

@@ -12,7 +12,17 @@ Original model: https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Base
 Uploading this since I'm using it to calculate imatrix, figured might as well provide it in the meantime
-Remember, this is a **BASE** model, so it likely will not chat properly unless you give it multiple turns of examples (I'm going to test a bit)
 382G total size

 Uploading this since I'm using it to calculate imatrix, figured might as well provide it in the meantime
+Remember, this is a **BASE** model, so it likely will not chat properly unless you give it multiple turns of examples. For instance, I've had success with:
+```
+./llama-cli -m /models/deepseek-ai_DeepSeek-V3.1-Base-Q4_K_M-00001-of-00011.gguf -p "You are a helpful assistant.<｜User｜>Hello, how are you?<｜Assistant｜>I'm doing well thanks! Yourself?<｜User｜>I'm doing great! Can you explain the laws of thermodynamics?<｜Assistant｜>" -no-cnv -ngl 0
+```
+This resulted in a completely coherent reply:
+> The first law of thermodynamics is that energy can neither be created nor destroyed. The second law states that entropy, or disorder, in the universe will always increase. The third law states that a perfect crystal at absolute zero would have zero entropy.
+The idea is that you first need to show the base model what a conversation looks like. Base models usually can't one-shot a conversation because they haven't been tuned to understand roles.
 382G total size
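
The prompt in the added command follows a simple pattern: a system line, one or more completed user/assistant exchanges, then the real question with an open assistant turn. Below is a minimal Python sketch of assembling such a prompt. The function name `build_few_shot_prompt` and the constant names are hypothetical, and the role markers are copied verbatim from the command in the README; whether they match the model's official special tokens is an assumption.

```python
# Role markers copied verbatim from the llama-cli command in the README.
# Assumption: the base model recognises these strings as turn delimiters.
USER_TAG = "<｜User｜>"
ASSISTANT_TAG = "<｜Assistant｜>"


def build_few_shot_prompt(system, example_turns, question):
    """Concatenate a system line, completed example turns, and a final
    question that leaves the assistant turn open for the model to continue.

    example_turns is a list of (user_message, assistant_message) pairs.
    """
    parts = [system]
    for user_msg, assistant_msg in example_turns:
        # Each example shows the model one full user -> assistant exchange.
        parts.append(f"{USER_TAG}{user_msg}{ASSISTANT_TAG}{assistant_msg}")
    # End on an empty assistant turn so the completion is the reply.
    parts.append(f"{USER_TAG}{question}{ASSISTANT_TAG}")
    return "".join(parts)


prompt = build_few_shot_prompt(
    "You are a helpful assistant.",
    [("Hello, how are you?", "I'm doing well thanks! Yourself?")],
    "I'm doing great! Can you explain the laws of thermodynamics?",
)
print(prompt)
```

The resulting string can be passed to `llama-cli` through `-p`, as in the command above. Adding more pairs to `example_turns` gives the model more turns to imitate.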