cgus committed · verified
Commit ddc0a3e · 1 Parent(s): 32abcf4

Update README.md

Files changed (1)
  1. README.md +4 -1
README.md CHANGED
@@ -20,7 +20,10 @@ Based on: [Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B) by [Qwen](https://hug
  ## Quantization notes
  Made with Exllamav2 0.2.9 dev branch with default dataset. You need either to wait for next exllamav2 release or install it from [dev branch](https://github.com/turboderp-org/exllamav2/tree/dev) to use these quants.
  It can be used with RTX GPU on Windows or RTX/ROCm card on Linux with TabbyAPI or Text-Generation-WebUI.
- Ensure you have enough VRAM to run it since it exllamav2 doesn't support RAM offloading.
+ Ensure you have enough VRAM to run it, since exllamav2 doesn't support RAM offloading.
+
+ When I used it with TabbyAPI + SillyTavern, I had to explicitly uncheck "Add BOS token" to make the model work properly; otherwise the output looped.
+ However, the model worked perfectly fine with TabbyAPI + OpenWebUI.
 
  # Original model card
  # JOSIEFIED Model Family
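
Side note on the dev-branch requirement mentioned in the hunk above: a quick way to confirm the installed exllamav2 build is new enough for these quants is to print its version. This is only a sketch; it assumes the package exposes `__version__`, as recent exllamav2 builds do.

```python
# Sanity check (sketch): confirm the installed exllamav2 build is 0.2.9 dev or
# newer, per the README note above. Assumes exllamav2 exposes __version__.
import exllamav2

print("exllamav2 version:", exllamav2.__version__)
```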
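
For the BOS-token workaround described in the added lines, here is a minimal sketch of the equivalent request sent directly to a local TabbyAPI server rather than through SillyTavern. The endpoint, port, auth header, and the `add_bos_token` field name are assumptions based on TabbyAPI's OpenAI-compatible API and are not documented in this README; verify them against your own TabbyAPI configuration.

```python
# Minimal sketch: disable BOS-token insertion on a local TabbyAPI server,
# mirroring the "uncheck Add BOS token" fix above. The URL, key, and the
# add_bos_token field are assumptions; check your TabbyAPI setup.
import requests

resp = requests.post(
    "http://127.0.0.1:5000/v1/completions",                  # default TabbyAPI address (assumed)
    headers={"Authorization": "Bearer YOUR_TABBY_API_KEY"},  # placeholder key
    json={
        "prompt": "Hello! Introduce yourself briefly.",
        "max_tokens": 128,
        "add_bos_token": False,  # API-side equivalent of unchecking "Add BOS token" (assumed field)
    },
    timeout=120,
)
print(resp.json()["choices"][0]["text"])
```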