Updated Model

by D0schie - opened Jun 25

Discussion

D0schie

Jun 25

At first glance, the IQ1_S model stopped looping indefinitely after I asked it about itself, so thanks for the update!

Mungert

Owner Jun 25

At first glance, the IQ1_S model stopped looping indefinitely after I asked it about itself, so thanks for the update!
Thanks for the feedback.

The IQ1_S quant has not been updated yet. Unfortunately my hardware is limited, so it's slow. Check the file update time in about 10 hours, maybe. It would be interesting to see whether the test you ran with the IQ1_S model shows improvements with the new IQ1_S quant.

D0schie

Jun 25

Oh, I see. Don't worry, I will give feedback! By the way, do you know if I can just update the LLM in LM Studio, or do I need to download the whole thing again?

D0schie

Jun 25

Are you doing all of this work on just a Zen 4?!

Mungert

Owner Jun 25

edited Jun 25

Are you doing all of this work on just a Zen 4?!

Yeah, well, not exactly. I use two VPSs rented from contabo.com. One VPS has 6 GB RAM and 4 threads of an AMD EPYC 9224, which has AVX-512; the other has 12 GB RAM and 6 threads of an Intel Core Processor (Broadwell), which only has AVX2. They run at about the same speed. It adds up to less compute than a Zen 4, but there is no lower option on the profile page. That's all I can afford at the moment :(

Mungert

Owner Jun 25

Oh, I see. Don't worry, I will give feedback! By the way, do you know if I can just update the LLM in LM Studio, or do I need to download the whole thing again?

You will need to download the whole GGUF file again, as it will have different weights for pretty much all the tensors.
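
If you want to confirm that the copy on your disk really is the updated file, one option is to compare checksums. This is only a rough sketch: it assumes the repo's file page lists a SHA-256 for the GGUF, and `sha256_of_file` and `matches_expected` are hypothetical helper names, not part of LM Studio or any library.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so a multi-GB GGUF does not need to fit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        # Read until f.read() returns the empty-bytes sentinel at end of file.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: Path, expected_hex: str) -> bool:
    """True if the local file's SHA-256 equals the checksum you copied (assumed input)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

If the checksum of the local file differs from the one listed for the updated quant, the local copy is the old one and needs a fresh download.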

D0schie

Jun 25

Are you doing all of this work on just a Zen 4?!

Yeah, well, not exactly. I use two VPSs rented from contabo.com. One VPS has 6 GB RAM and 4 threads of an AMD EPYC 9224, which has AVX-512; the other has 12 GB RAM and 6 threads of an Intel Core Processor (Broadwell), which only has AVX2. They run at about the same speed. It adds up to less compute than a Zen 4, but there is no lower option on the profile page. That's all I can afford at the moment :(

Nah, that's awesome, keep going!

Ricmod

Jun 29

Are you doing all of this work on just a Zen 4?!

Yeah, well, not exactly. I use two VPSs rented from contabo.com. One VPS has 6 GB RAM and 4 threads of an AMD EPYC 9224, which has AVX-512; the other has 12 GB RAM and 6 threads of an Intel Core Processor (Broadwell), which only has AVX2. They run at about the same speed. It adds up to less compute than a Zen 4, but there is no lower option on the profile page. That's all I can afford at the moment :(

Make sure you're using AVX-512 builds on the EPYC. AVX-512 support can put CPU performance within an order of magnitude or two of a GPU.

Mungert

Owner Jun 29

Are you doing all of this work on just a Zen 4?!

Yeah, well, not exactly. I use two VPSs rented from contabo.com. One VPS has 6 GB RAM and 4 threads of an AMD EPYC 9224, which has AVX-512; the other has 12 GB RAM and 6 threads of an Intel Core Processor (Broadwell), which only has AVX2. They run at about the same speed. It adds up to less compute than a Zen 4, but there is no lower option on the profile page. That's all I can afford at the moment :(

Make sure you're using AVX-512 builds on the EPYC. AVX-512 support can put CPU performance within an order of magnitude or two of a GPU.

Yes, building OpenBLAS on the EPYC host machine detects AVX-512 and builds with it. That helps with prompt processing. I'm not totally sure, but I think llama.cpp also builds with AVX-512.
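
For anyone who wants to check what their own VPS exposes, here is a minimal sketch. It assumes an x86 Linux host where `/proc/cpuinfo` has a `flags` line; `cpu_flags` and `simd_support` are hypothetical helper names. It only reports what the CPU advertises, not what a given OpenBLAS or llama.cpp binary was actually compiled with.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Collect the feature flags from every 'flags' line of /proc/cpuinfo text."""
    flags: set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return flags


def simd_support(cpuinfo_text: str) -> dict[str, bool]:
    """Report whether AVX2 and the AVX-512 foundation set (avx512f) are advertised."""
    flags = cpu_flags(cpuinfo_text)
    return {"avx2": "avx2" in flags, "avx512f": "avx512f" in flags}


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")  # present on Linux only
    if cpuinfo.exists():
        print(simd_support(cpuinfo.read_text()))
    else:
        print("No /proc/cpuinfo here; this sketch only covers Linux.")
```

On a host like the EPYC one described above, `avx512f` would be expected to show up as present, and on the Broadwell one only `avx2`.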
