The long-awaited Stheno Ultra

by Utochi - opened Jun 25

Jun 25

Once Q8 drops I will give this a review. I've been looking forward to this model for a while now, so a big THANK YOU for making this @DavidAU

DavidAU

Owner Jun 25

edited Jun 25

I suggest you try the IQ4XS, Q4s or Q6. Q8s (for a lot of models) seem to be "flat" in terms of performance.
NEO tech performance is at its maximum at IQ4XS, Q4s, and Q5s.
Although this does not apply to all Q8s, there is enough evidence from my own testing and the feedback of others to merit this advice.
Likewise, any QxKM quant seems to be stronger in terms of creativity.
This varies more sharply in "IQs".
That being said, I am looking into ways to make Q8 quants more powerful.

Sir-Dan

Jun 29

Thank you @DavidAU! These quants are amazing in my shallow tests on a potato PC. Any chance you could make Nymeria 8B (less horny Stheno) quants?

DavidAU

Owner Jun 30

Thank you for your feedback.
RE: Nymeria 8B:
Can you give me some details about this one? Pluses? Minuses?

Thanks again.

Sir-Dan

Jun 30

To me it feels like a softened/diluted Stheno, which helps a lot in the kind of consensual RP (i.e. cuddling) where you don't necessarily want to steer the situation into NSFW. Stheno sometimes starts feeling horny in seemingly uninviting situations and forces NSFW, even when instructed not to do so in the system prompt. Nymeria is more balanced in my experience. She also follows the system prompt very precisely. I have a 1400-token prompt and she takes everything into account successfully.

DavidAU

Owner Jun 30

Thank you for the quick reply; I will check into this model.
Balance in the NSFW area, so to speak, is something people are asking for.

Utochi

Jul 1

Reviewing Stheno Ultra:
Honestly, I didn't notice a lot of difference, though Q6 Ultra feels very similar to Q8 of normal Stheno. Q8 Ultra wasn't as smart as I expected it to be. I tested using the Llama 3 Instruct template, but I intend to test with the Command R template once I'm able to.

DavidAU

Owner Jul 1

Try the IQ4, Q4s; this is where the biggest differences will be. Q5 will be good, but Q6/Q8 will show less of a difference because of how the dataset is applied at these quants.
