Updated Model

#2
by D0schie - opened

At first glance, the IQ1_S model stopped looping indefinitely after I asked it about itself, so thanks for the update!

Owner

Thanks for the feedback.

The iq1_s quant has not been updated yet. Unfortunately my hardware is limited, so it's slow. Check the file update time in about 10 hours, maybe. It would be interesting to see if the test you ran with the old iq1_s model shows improvements with the new iq1_s quant.

Oh, I see. Don't worry, I will give feedback! By the way, do you know if I can just update the LLM in LM Studio, or do I need to download the whole thing again?

Are you doing all of this work just on a Zen4?!

Owner
edited Jun 25

Yeah, well, not exactly. I use two VPSes rented from contabo.com. One has 6GB RAM and 4 threads of an AMD EPYC 9224, which has AVX512; the other has 12GB RAM and 6 threads of an Intel Core Processor (Broadwell), which only has AVX2. They run at about the same speed. That adds up to less compute than a Zen4, but there is no lower option on the profile page. That's all I can afford at the moment :(

Owner

You will need to download the whole GGUF file again, as it will have different weights for pretty much all the tensors.

Nah, that's awesome, keep going!

Make sure you're using AVX512 builds on the EPYC. AVX512 support can put CPU performance within an order of magnitude or two of a GPU.

Owner

Yes, building OpenBLAS on the EPYC host machine detects AVX512 and builds with it. That helps with prompt processing. I'm not totally sure, but I think llama.cpp builds with AVX512 as well.
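
For anyone wanting to check this on their own Linux host, one quick way to see which AVX variants the CPU exposes is to read the flags line of /proc/cpuinfo. A minimal sketch follows; the helper name simd_flags is made up for illustration, and it only reports what the CPU advertises, not what a particular llama.cpp or OpenBLAS build was actually compiled with.

```python
def simd_flags(cpuinfo_text):
    """Return the AVX-related flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        # On x86 Linux the feature list is on a line such as "flags\t\t: fpu ... avx2 ..."
        if line.startswith("flags"):
            flags = line.split(":", 1)[1].split()
            return {f for f in flags if f.startswith("avx")}
    # No flags line found (for example, a non-x86 machine)
    return set()


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            found = simd_flags(fh.read())
    except OSError:
        found = set()
        print("Could not read /proc/cpuinfo (probably not a Linux host)")
    print("AVX2:", "avx2" in found)
    print("AVX512 (foundation):", "avx512f" in found)
```

On the two machines described above, one would expect the EPYC VPS to report avx512f and the Broadwell VPS to report only avx2, assuming the hypervisor passes those flags through to the guest.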
