The long-awaited Stheno Ultra
I suggest you try the IQ4XS, Q4s, or Q6. Q8s (for a lot of models) seem to be "flat" in terms of performance.
NEO tech performance is at its maximum at IQ4XS, Q4s, and Q5s.
Although this does not apply to all Q8s, there is enough evidence from my own testing and feedback from others to merit this advice.
Likewise, any QxKM quant seems to be stronger in terms of creativity.
This varies more sharply in "IQs".
That being said, I am looking into ways to make Q8 quants more powerful.
Thank you for your feedback.
RE: Nymeria 8b:
Can you give me some details about this one? Pluses? Minuses?
Thanks again.
To me it feels like a softened/diluted Stheno, which helps a lot in the kind of consensual RP (i.e. cuddling) where you don't necessarily want to steer the situation into NSFW. Stheno sometimes starts feeling horny in seemingly uninviting situations and forces NSFW, even when instructed not to do so in the system prompt. Nymeria is more balanced in my experience. She also follows the system prompt very precisely. I have a 1400-token prompt and she takes everything into account successfully.
Thank you for the quick reply; I will look into this model.
Balance in the NSFW area, so to speak, is something people are asking for.
Reviewing Stheno Ultra
Honestly, for me, I didn't notice a lot of difference, though. Q6 Ultra feels very similar to Q8 normal Stheno. Q8 Ultra wasn't as smart as I expected it to be. I tested using the Llama 3 Instruct template, but I intend to test with the Command R template once I'm able to.
Try the IQ4 and Q4s; this is where the biggest differences will be. Q5 will be good, but the difference at Q6/Q8 will be smaller because of how the dataset is applied at these quants.