Anyone else failing to load a model with llama.cpp?
Tried multiple configs, enabling/disabling:
flash attention; KV cache Q8; context size; max GPU layers (28)
And I am getting "llama_model_load: error loading model"
LM Studio, on the other hand, can load the model. Anyone with the same problem?
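For reference, the options above map onto a llama.cpp invocation roughly like this. This is a sketch, not the OP's exact command: the model path and context size are placeholders, and flag spellings are taken from recent llama.cpp builds (check `llama-cli --help` on your own build).

```shell
# Hypothetical invocation illustrating the settings mentioned above.
# ./models/model.gguf and -c 4096 are placeholders.
./llama-cli -m ./models/model.gguf \
  --flash-attn \
  --cache-type-k q8_0 \
  --cache-type-v q8_0 \
  -c 4096 \
  -ngl 28
```

Note that in llama.cpp, quantizing the V cache (`--cache-type-v`) requires flash attention to be enabled, so toggling `--flash-attn` off while keeping a Q8 KV cache is itself a possible source of errors, though it normally produces a different message than "error loading model".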
That's the kind of error you get with a very old runtime. Did you build it recently?
I've just built it now and still get the same error. I really don't know what's wrong; I haven't seen this problem mentioned anywhere.
git describe --tags
b5133
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Wed_Oct_30_01:18:48_Pacific_Daylight_Time_2024
Cuda compilation tools, release 12.6, V12.6.85
Build cuda_12.6.r12.6/compiler.35059454_0
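For completeness, a fresh CUDA build of llama.cpp around tag b5133 typically looks something like this (a sketch; exact options can vary between versions, and much older trees used `LLAMA_CUBLAS` instead of `GGML_CUDA`):

```shell
# Rebuild llama.cpp from a clean state with CUDA support.
git pull
rm -rf build
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j
```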
OK, found the problem. It seems related to the disk: when the model was moved to an SSD (it was previously on an HDD), it suddenly loaded. I don't know why llama.cpp fails to load the model from the HDD while LM Studio, which is based on llama.cpp, can.
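One guess at the mechanism (an assumption, not confirmed by this thread): llama.cpp memory-maps the model file by default, and mmap page faults on a slow HDD behave very differently from the buffered sequential reads a loader app might use. A cheap thing to test before moving files around is disabling mmap:

```shell
# --no-mmap makes llama.cpp read the whole model into RAM up front
# instead of memory-mapping it; worth trying when the model sits on
# a slow HDD. Model path is a placeholder.
./llama-cli -m ./models/model.gguf --no-mmap -ngl 28
```

If this loads from the HDD, the issue is with how the mapped file is being paged in, not with the model file itself.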
Dafuq, that's extremely unexpected. Almost like llama.cpp is getting impatient while loading and assuming it failed..?