Model over-replies to user requests
#1 by Narutoouz - opened
I tested Q8 GGUF quants of this model. It replies with random things to a simple "hi" message. I thought the issue was with the quantisation, but I tried another Q8 quant of the same model and it shows the same behaviour. It is not an issue with llama.cpp either, because the 8-bit MLX model also showed the same behaviour. Here I am showing the supporting images.
Narutoouz changed discussion status to closed