Model over-replies to user requests

by Narutoouz - opened 24 days ago

24 days ago


I tested Q8 GGUF quants of this model. It replies with random, unrelated text to a simple "hi" message. I thought the issue was with the quantization, but I tried another Q8 quant of the same model and it shows the same behaviour. It is not an issue with llama.cpp either, because the 8-bit MLX model also showed the same behaviour. Here are the supporting images.

Narutoouz

24 days ago

There is a similar issue with the 32B model as well. I don't know whether the Jinja template or something else is causing it.

Narutoouz changed discussion status to closed 24 days ago
