Do these quants support MLA?
Hi there, thanks for your work!
I was wondering, do your DeepSeek V3 0324 quants support MLA, as added in this PR: https://github.com/ggml-org/llama.cpp/pull/12801
MLA greatly reduces VRAM usage; for example, at 16K ctx the KV cache drops from about 80GB to 2GB.
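For context, here is a rough back-of-envelope sketch of where numbers like that come from, assuming DeepSeek-V3's published config values (61 layers, 128 heads, kv_lora_rank 512, rope dim 64) and an fp16 cache; the exact savings depend on cache dtype and implementation:

```python
# Rough KV-cache sizing sketch, assuming DeepSeek-V3 config values.
# Illustrative numbers only, not exact measurements from llama.cpp.

N_LAYERS = 61
N_HEADS = 128
K_HEAD_DIM = 192       # qk_nope_head_dim (128) + qk_rope_head_dim (64)
V_HEAD_DIM = 128
KV_LORA_RANK = 512     # compressed latent dimension used by MLA
ROPE_DIM = 64
BYTES = 2              # fp16 cache entries
CTX = 16 * 1024        # 16K context

def naive_kv_bytes(ctx: int) -> int:
    """Cache full per-head K and V vectors for every layer and token."""
    per_token = N_LAYERS * N_HEADS * (K_HEAD_DIM + V_HEAD_DIM) * BYTES
    return per_token * ctx

def mla_kv_bytes(ctx: int) -> int:
    """MLA caches only the shared compressed latent plus the rope part."""
    per_token = N_LAYERS * (KV_LORA_RANK + ROPE_DIM) * BYTES
    return per_token * ctx

print(f"naive: {naive_kv_bytes(CTX) / 1e9:.1f} GB")  # ~82 GB
print(f"MLA:   {mla_kv_bytes(CTX) / 1e9:.1f} GB")    # ~1.2 GB
```

So the order of magnitude matches: tens of GB without MLA versus around a gigabyte with it at 16K context.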
Thanks!
No, I haven't remade them yet because of the issues mentioned towards the bottom of that PR.
I see! It seems it was fixed a few days ago:
https://huggingface.co/unsloth/DeepSeek-V3-0324-GGUF-UD/discussions/2#68192917c3d212ad5b33964d
oooo good catch, okay I'll probably look at remaking them then.
Just bumping it. :)
Thanks, but I still have concerns about this issue mentioned by ikawrakow; I'm not sure how to adequately address it:
https://github.com/ikawrakow/ik_llama.cpp/pull/411
I was hoping Johannes would figure it out in the meantime, but I don't think he has.