Fanboy + made 64k/192k version + NEO/HORROR Imatrix for both.

#5
by DavidAU - opened

Okay, I have to admit I am a fan.
Your method of "de-censoring" and "fine tuning" wins hands down.

I tested and uploaded 64k/192k versions of your model (source), and created Imatrix NEO and HORROR
quants for both sizes as well (64k is uploaded, 192k is uploading today).

64K: [link to NEO version on this page] ; still updating pages (!)
https://huggingface.co/DavidAU/Qwen3-8B-64k-Josiefied-Uncensored-HORROR-Max-GGUF

192k:
https://huggingface.co/DavidAU/Qwen3-8B-192k-Josiefied-Uncensored-NEO-Max-GGUF
(uploading currently, HORROR version to follow)

I ran the HORROR Imatrix IQ3_M 192k version, which output 13k from a single, short fiction/horror prompt.
(I will add this to the repo later today.)

Note that the "reg" (non-Imatrix) IQ3_M does not work well. This was expected: you really need an Imatrix at this quant level.
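
For anyone who wants to reproduce the Imatrix vs. non-Imatrix comparison at IQ3_M, here is a minimal sketch of the two steps, assuming the llama.cpp tools `llama-imatrix` and `llama-quantize` are used. The post does not say which tooling or calibration data produced the NEO/HORROR quants, so the file names below are hypothetical placeholders, and the script only builds and prints the command lines rather than running them.

```python
import shlex


def imatrix_commands(f16_gguf, calibration_txt, out_gguf,
                     quant="IQ3_M", imatrix_path="imatrix.dat"):
    """Build the two llama.cpp command lines for an Imatrix quant.

    All paths are hypothetical placeholders; nothing is executed here.
    """
    # Step 1: compute the importance matrix from a calibration text file.
    build_imatrix = [
        "llama-imatrix",
        "-m", f16_gguf,
        "-f", calibration_txt,
        "-o", imatrix_path,
    ]
    # Step 2: quantize the full-precision GGUF using that importance matrix.
    quantize = [
        "llama-quantize",
        "--imatrix", imatrix_path,
        f16_gguf,
        out_gguf,
        quant,
    ]
    return build_imatrix, quantize


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for cmd in imatrix_commands("model-f16.gguf", "calibration.txt",
                                "model-IQ3_M.gguf"):
        print(shlex.join(cmd))
```

Dropping the `--imatrix` pair from the second command gives the "reg" quant for a side-by-side test.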

General QWEN 3 note:

I have found that Imatrix significantly improves / changes the performance of all Qwen3 models, and there are
also much larger differences between different Imatrix datasets.

Likewise, changing context via YARN (as per the tech notes at Qwen's repo) also affects performance, Imatrix "effects",
and so on. So far I have taken Qwen 8B to 320k. Imatrix NEO/HORROR IQ3_M still works great, even at 320k.

(these are also up at my repo - all sizes from 64k to 320k)
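
As a rough illustration of the YARN context change mentioned above, here is a sketch of the `rope_scaling` entry that would go into the model's `config.json`. It assumes a native context of 32768 for Qwen3-8B and that "64k", "192k" and "320k" mean 65536, 196608 and 327680 tokens. The post does not state the exact values used, so treat these as assumptions, and note that factors this large go beyond what Qwen's own notes describe.

```python
import json

# Assumption: native (pre-YARN) context length of the base model.
NATIVE_CONTEXT = 32768


def yarn_rope_scaling(target_context, native_context=NATIVE_CONTEXT):
    """Return a rope_scaling dict that stretches native_context to target_context."""
    if target_context <= native_context:
        raise ValueError("target context must exceed the native context")
    return {
        "rope_type": "yarn",
        # Scaling factor = desired context / native context.
        "factor": target_context / native_context,
        "original_max_position_embeddings": native_context,
    }


if __name__ == "__main__":
    # Assumed token counts for the sizes named in the post.
    for label, tokens in [("64k", 65536), ("192k", 196608), ("320k", 327680)]:
        print(label, json.dumps({"rope_scaling": yarn_rope_scaling(tokens)}))
```

Because the scaling is fixed in the config, it applies to short prompts too, which may be part of why performance and Imatrix "effects" shift between the different context sizes.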

Thanks!!!! The README has been updated to include your versions too.

Goekdeniz-Guelmez changed discussion status to closed
Goekdeniz-Guelmez changed discussion status to open
