Context length

#1
by narpas - opened

How much do you mean by "semi-decent" context? I'm trying to decide which of your quants to dedicate my download bandwidth to.

Hi there. I could load 16k of context with an fp16 cache without issues and with a good amount of VRAM left, so I think 32k with an fp16 cache is also possible.

With a q8 cache, 64k should also be quite feasible, but I haven't tested it.
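For a rough sense of why those numbers line up, here is a minimal sketch of the usual KV cache size estimate. The model shape below (layer count, KV heads, head dimension) is a hypothetical placeholder, not this model's actual config, and q8 is approximated as 1 byte per element, ignoring any per-block scale overhead a real backend adds.

```python
def kv_cache_bytes(context_len, n_layers, n_kv_heads, head_dim, bytes_per_element):
    """Approximate KV cache size in bytes for a given context length."""
    # Factor of 2: each layer stores one key tensor and one value tensor.
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_element


# Hypothetical model shape (assumption, replace with values from the model's config).
N_LAYERS, N_KV_HEADS, HEAD_DIM = 80, 8, 128

# fp16 = 2 bytes per element; q8 approximated as 1 byte per element (assumption).
cases = [("16k fp16", 16384, 2), ("32k fp16", 32768, 2), ("64k q8", 65536, 1)]

for label, context_len, bytes_per_element in cases:
    size = kv_cache_bytes(context_len, N_LAYERS, N_KV_HEADS, HEAD_DIM, bytes_per_element)
    print(f"{label}: {size / 2**30:.2f} GiB")
```

Under these assumptions, halving the bytes per element lets the context length double for the same cache size, so 64k at q8 should cost about as much VRAM as 32k at fp16.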
