Context length
#1 opened by narpas
How much context do you mean by "semi-decent"? I'm trying to decide which of your quants to dedicate my download bandwidth to.
Hi there, I could load 16k context with an fp16 cache without issues and with a good amount of VRAM left over, so I think 32k with an fp16 cache is also possible.
With a q8 cache, 64k should also be quite feasible, but I haven't tested it.
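For a rough sense of why those numbers line up, here is a minimal sketch of the usual KV cache size estimate. The layer count, KV head count, and head dimension below are made-up placeholders, not this model's real configuration, and treating a q8 cache as exactly 1 byte per element is an approximation (real 8-bit cache formats usually add a little overhead for scales).

```python
# Rough KV cache size estimate. All model dimensions here are hypothetical
# placeholders; substitute the values from the actual model config.

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_element):
    """Approximate KV cache size in bytes for a given context length."""
    # Factor of 2: one tensor for keys and one for values in every layer.
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_element

# Hypothetical model shape (assumption, not taken from the thread).
N_LAYERS = 32
N_KV_HEADS = 8
HEAD_DIM = 128

FP16_BYTES = 2  # fp16 stores each element in 2 bytes
Q8_BYTES = 1    # approximation: 8-bit cache, ignoring per-block scale overhead

GIB = 1024 ** 3

for label, ctx, width in [
    ("16k fp16", 16 * 1024, FP16_BYTES),
    ("32k fp16", 32 * 1024, FP16_BYTES),
    ("64k q8", 64 * 1024, Q8_BYTES),
]:
    size = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, ctx, width)
    print(f"{label}: {size / GIB:.2f} GiB")
```

Under these assumptions the cache grows linearly with context length, so 32k at fp16 needs about twice what 16k does, and 64k at q8 lands at roughly the same size as 32k at fp16. Whether it actually fits still depends on the quant you pick and how much VRAM is left after the weights are loaded.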