Based on: [Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B) by Qwen

## Quantization notes

Made with exllamav2 0.2.9 (dev branch) with the default dataset. You need to either wait for the next exllamav2 release or install it from the [dev branch](https://github.com/turboderp-org/exllamav2/tree/dev) to use these quants.
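One way to install from the dev branch, assuming a Python environment with a working CUDA or ROCm toolchain, is a direct pip install from git (verify the branch name against the repository before running):

```shell
# Install exllamav2 from the dev branch until the next release lands.
# Building from source requires a compiler plus CUDA or ROCm headers.
pip install git+https://github.com/turboderp-org/exllamav2.git@dev
```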

It can be used with an RTX GPU on Windows, or an RTX/ROCm card on Linux, with TabbyAPI or Text-Generation-WebUI.

Ensure you have enough VRAM to run it, since exllamav2 doesn't support RAM offloading.

When I used it with TabbyAPI + SillyTavern, I had to explicitly uncheck "Add BOS token" to make the model work properly; otherwise the output looped.

However, the model worked perfectly fine with TabbyAPI + OpenWebUI.
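If you drive TabbyAPI directly rather than through a frontend, a minimal sketch of a chat-completion request payload looks like the following. The endpoint URL, the model name, and the `add_bos_token` extension field are assumptions based on my setup (mirroring the "Add BOS token" checkbox above); verify them against your TabbyAPI version:

```python
import json

# Hypothetical local TabbyAPI endpoint; the default port may differ in your setup.
URL = "http://127.0.0.1:5000/v1/chat/completions"

# "add_bos_token": False mirrors unchecking "Add BOS token" in SillyTavern.
# This is an assumed TabbyAPI extension field, not part of the base OpenAI schema.
payload = {
    "model": "JOSIEFIED-Qwen3-8B-exl2",  # hypothetical model name
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 128,
    "add_bos_token": False,
}

body = json.dumps(payload)
# Send with e.g. requests.post(URL, data=body, headers=...) once the server is up.
print(body)
```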
# Original model card
# JOSIEFIED Model Family