Prompt time (ollama) on 22c Xeon, 5070 Ti, 128GB RAM. (Q6_K_L)

#12
by MikeZeroTango - opened

The purpose of this post is to make the model a bit easier to judge for newbie users (like me), with some data that is more understandable to us.

Model: THUDM_GLM-4-32B-0414-Q6_K_L.gguf

The prompt took 4 minutes on my system:
GPU utilization: 15%-20%
CPU utilization: 50%

Untitled.png

The prompt took 1 minute on my system:
GPU utilization: 15%-20%
CPU utilization: 50%

Untitled1.png
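
For anyone who wants numbers that are easier to compare than a stopwatch time, here is a rough sketch of how the timing fields in an Ollama `/api/generate` response could be turned into seconds and tokens per second. It assumes the response has already been parsed into a Python dict and that the fields `total_duration`, `prompt_eval_duration`, `eval_duration` (all in nanoseconds), `prompt_eval_count` and `eval_count` are present. The helper name `summarize_timing` and the example values are made up for illustration and are not measurements from the runs above.

```python
# Sketch: convert the timing fields of an Ollama /api/generate response
# into readable numbers. Ollama reports durations in nanoseconds.

NS_PER_SECOND = 1_000_000_000


def summarize_timing(response: dict) -> dict:
    """Return seconds and tokens/second for the prompt and generation phases.

    `summarize_timing` is a hypothetical helper, not part of Ollama.
    Missing fields are treated as 0 so the function never raises KeyError.
    """
    prompt_s = response.get("prompt_eval_duration", 0) / NS_PER_SECOND
    generation_s = response.get("eval_duration", 0) / NS_PER_SECOND
    total_s = response.get("total_duration", 0) / NS_PER_SECOND
    prompt_tokens = response.get("prompt_eval_count", 0)
    generation_tokens = response.get("eval_count", 0)
    return {
        "total_s": total_s,
        "prompt_s": prompt_s,
        "generation_s": generation_s,
        # Guard against division by zero when a phase reports no time.
        "prompt_tokens_per_s": prompt_tokens / prompt_s if prompt_s else 0.0,
        "generation_tokens_per_s": (
            generation_tokens / generation_s if generation_s else 0.0
        ),
    }


# Made-up example values, NOT measurements from this post.
example = {
    "total_duration": 90_000_000_000,
    "prompt_eval_count": 150,
    "prompt_eval_duration": 30_000_000_000,
    "eval_count": 120,
    "eval_duration": 60_000_000_000,
}

summary = summarize_timing(example)
print(summary)
```

Splitting the total into a prompt phase and a generation phase shows whether the wait comes from reading the prompt or from writing the answer, which a single "it took N minutes" figure cannot tell you.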
