Can't load model in text-generation-webui
Whenever I try to load this model in oobabooga, I get this error printout in the console.
23:32:05-133346 INFO Loading "TheBloke_mixtral-8x7b-v0.1-AWQ"
Replacing layers...: 100%|█████████████████████████████████████████████████████████████| 32/32 [00:03<00:00, 10.21it/s]
Fusing layers...:  81%|███████████████████████████████████████████████████              | 26/32 [01:37<00:22,  3.75s/it]
23:34:02-522888 ERROR Failed to load the model.
Traceback (most recent call last):
File "B:\local-ai-chat\text-generation-webui-main\modules\ui_model_menu.py", line 245, in load_model_wrapper
shared.model, shared.tokenizer = load_model(selected_model, loader)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "B:\local-ai-chat\text-generation-webui-main\modules\models.py", line 87, in load_model
output = load_func_map[loader](model_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "B:\local-ai-chat\text-generation-webui-main\modules\models.py", line 302, in AutoAWQ_loader
model = AutoAWQForCausalLM.from_quantized(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "B:\local-ai-chat\text-generation-webui-main\installer_files\env\Lib\site-packages\awq\models\auto.py", line 94, in from_quantized
return AWQ_CAUSAL_LM_MODEL_MAP[model_type].from_quantized(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "B:\local-ai-chat\text-generation-webui-main\installer_files\env\Lib\site-packages\awq\models\base.py", line 440, in from_quantized
self.fuse_layers(model)
File "B:\local-ai-chat\text-generation-webui-main\installer_files\env\Lib\site-packages\awq\models\mixtral.py", line 25, in fuse_layers
fuser.fuse_transformer()
File "B:\local-ai-chat\text-generation-webui-main\installer_files\env\Lib\site-packages\awq\models\mixtral.py", line 161, in fuse_transformer
MixtralBlock(
File "B:\local-ai-chat\text-generation-webui-main\installer_files\env\Lib\site-packages\awq\modules\fused\block.py", line 25, in init
self.norm_1 = norm_1.to(dev)
^^^^^^^^^^^^^^
File "B:\local-ai-chat\text-generation-webui-main\installer_files\env\Lib\site-packages\torch\nn\modules\module.py", line 1152, in to
return self._apply(convert)
^^^^^^^^^^^^^^^^^^^^
File "B:\local-ai-chat\text-generation-webui-main\installer_files\env\Lib\site-packages\torch\nn\modules\module.py", line 825, in _apply
param_applied = fn(param)
^^^^^^^^^
File "B:\local-ai-chat\text-generation-webui-main\installer_files\env\Lib\site-packages\torch\nn\modules\module.py", line 1150, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
NotImplementedError: Cannot copy out of meta tensor; no data!
I have tried redownloading the model, giving it less GPU VRAM, and lowering the context, but nothing seems to make a difference in getting it to load. Does anyone have any ideas? I'm used to either koboldcpp or TabbyAPI, but neither of them can run this model correctly.
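In case it helps narrow things down: the error is raised inside AutoAWQ's layer-fusion step rather than in the webui code itself. Below is a minimal standalone load that mirrors the from_quantized call in the traceback, just with fusion turned off, so I can check whether the base quantized weights load at all. This is only a sketch run against the awq and transformers packages in the same installer_files env; the exact keyword arguments can differ between AutoAWQ versions, so treat them as assumptions.

# Minimal reproduction of the AutoAWQ load path from the traceback, run from the
# installer_files environment. Keyword names are from the AutoAWQ version I have
# installed and may need adjusting for other versions.
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = r"B:\local-ai-chat\text-generation-webui-main\models\TheBloke_mixtral-8x7b-v0.1-AWQ"

# fuse_layers=True is what the webui loader uses and is where the meta-tensor error
# appears ("Fusing layers..."); fuse_layers=False skips that step entirely.
model = AutoAWQForCausalLM.from_quantized(model_path, fuse_layers=False)
tokenizer = AutoTokenizer.from_pretrained(model_path)

print("model loaded without layer fusion")

If that loads, the problem is specifically the fused-module path. If I'm reading modules/models.py right, the webui's no_inject_fused_attention option on the AutoAWQ loader maps onto that fuse_layers flag, so ticking it would be the equivalent thing to try from the UI.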