Quants please - ICONNAI/ICONN-1-Mini, ICONNAI/ICONN-1 and ICONNAI/ICONN-e1

#1065
by Enderchef - opened

If you cannot do them all, prioritize ICONN 1 Mini.

Maybe don't put them in the title, which gets truncated. Please provide a list with full URLs to the models (so I can click them) and we will try to quant them all.

ICONNAI/ICONN-1-Mini,
ICONNAI/ICONN-1
and ICONNAI/ICONN-e1. The last two are 92B, so you don't have to do them if you can't.

The links are https://huggingface.co/ICONNAI/ICONN-1-Mini, https://huggingface.co/ICONNAI/ICONN-1 and https://huggingface.co/ICONNAI/ICONN-e1

They are all queued! :D
Static quants are currently being computed, while weighted/imatrix quants will start as soon as the Llama-4-Maverick-17B-128E RPC imatrix computation is done in around 12 hours.

You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at the following URLs for the quants to appear:

I uploaded the incorrect jinja config file for my larger model on ICONN 1 Mini by mistake, so it might not work. In about an hour, I should have things fixed.

Yes, ICONN 1 Mini failed. Please let me know when it is fixed:

gguf serialising key  llama.rope.dimension_count value GGUFValue(value=None, type=<GGUFValueType.UINT32: 4>, sub_type=None)
Traceback (most recent call last):
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 6536, in <module>
    main()
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 6530, in main
    model_instance.write()
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 406, in write
    self.gguf_writer.write_kv_data_to_file()
  File "/llmjob/llama.cpp/gguf-py/gguf/gguf_writer.py", line 246, in write_kv_data_to_file
    kv_bytes += self._pack_val(val.value, val.type, add_vtype=True, sub_type=val.sub_type)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/llmjob/llama.cpp/gguf-py/gguf/gguf_writer.py", line 1041, in _pack_val
    kv_data += self._pack(pack_fmt, val, skip_pack_prefix = vtype == GGUFValueType.BOOL)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/llmjob/llama.cpp/gguf-py/gguf/gguf_writer.py", line 1031, in _pack
    return struct.pack(f'{pack_prefix}{fmt}', value)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
struct.error: required argument is not an integer
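
For reference, a minimal repro of that failure mode: GGUF packs UINT32 metadata with struct's `I` format, and a None value (here, a missing llama.rope.dimension_count) is not an integer:

```python
import struct

try:
    # GGUF metadata of type UINT32 is packed with the "I" struct format;
    # passing None where an integer is required reproduces the error above.
    struct.pack('<I', None)
except struct.error as e:
    print(e)  # required argument is not an integer
```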

I fixed ICONN Lite, and it should work - I just chatted with it to make sure.

@Enderchef ICONN Lite failed again, this time with this error. This is fixable. Just don't name them the same.

INFO:hf-to-gguf:token_embd.weight,           torch.bfloat16 --> F16, shape = {4096, 32001}
Traceback (most recent call last):
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 6536, in <module>
    main()
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 6530, in main
    model_instance.write()
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 403, in write
    self.prepare_tensors()
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 2014, in prepare_tensors
    super().prepare_tensors()
  File "/llmjob/llama.cpp/convert_hf_to_gguf.py", line 365, in prepare_tensors
    self.gguf_writer.add_tensor(new_name, data, raw_dtype=data_qtype)
  File "/llmjob/llama.cpp/gguf-py/gguf/gguf_writer.py", line 386, in add_tensor
    self.add_tensor_info(name, shape, tensor.dtype, tensor.nbytes, raw_dtype=raw_dtype)
  File "/llmjob/llama.cpp/gguf-py/gguf/gguf_writer.py", line 337, in add_tensor_info
    raise ValueError(f'Duplicated tensor name {name!r}')
ValueError: Duplicated tensor name 'token_embd.weight'

I'm a bit concerned about those tasks - they are still doing hfd after 3.5 hours. I will soon check the hfd log on rich1 and maybe have to nuke and restart them.

-2000  168 si ICONN-e1                                     run/hfd
-2000  168 si ICONN-1                                      run/hfd

Strange. I checked on rich1 and both ICONN-1 and ICONN-e1 seem to have been fully downloaded for a really long time. Unfortunately, I can only find upload logs but not any download logs.

@mradermacher Do you have any idea why those two tasks are stuck inside run/hfd?

What do you mean "ICONN Lite failed again this time with this error. This is fixable. Just don't name them the same."?

ValueError: Duplicated tensor name 'token_embd.weight'
You can't have two tensors with the same name in your model.
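
As a minimal sketch of the constraint (assuming the gguf-py package that ships with llama.cpp), registering the same tensor name twice raises exactly this error:

```python
import numpy as np
from gguf import GGUFWriter  # gguf-py, bundled with llama.cpp

writer = GGUFWriter("demo.gguf", "llama")
embeddings = np.zeros((4, 4), dtype=np.float32)
writer.add_tensor("token_embd.weight", embeddings)
writer.add_tensor("token_embd.weight", embeddings)
# ValueError: Duplicated tensor name 'token_embd.weight'
```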

@nicoboss the job logs are in /dev/shm/, which you can access on rich1, but probably not on my hosts with llmc shell/audit.

They are not very illuminating, though:

Download complete. Moving file to ICONN-e1/tokenizer_config.json
Download complete. Moving file to ICONN-e1/tokenizer.json
Download complete. Moving file to ICONN-e1/model-00031-of-00035.safetensors
Download complete. Moving file to ICONN-e1/model-00032-of-00035.safetensors
Download complete. Moving file to ICONN-e1/model-00033-of-00035.safetensors
Download complete. Moving file to ICONN-e1/model-00034-of-00035.safetensors
Download complete. Moving file to ICONN-e1/model-00035-of-00035.safetensors

The Python process that downloads stuff hangs in a semaphore. I've killed and restarted it. Hopefully hf-xet 1.1.4 isn't a dud.

ValueError: Duplicated tensor name 'token_embd.weight'

That's because there are two sets of safetensors files, a common issue when people upload new weights without deleting the old ones. Another reason not to update models in place, but to create a new version instead. I've worked around it.
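
A quick way to spot this (a hedged sketch using the safetensors library; the directory name is hypothetical) is to list tensor names that occur in more than one shard:

```python
from collections import defaultdict
from pathlib import Path

from safetensors import safe_open

def find_duplicate_tensors(model_dir: str) -> None:
    # Map each tensor name to the shards that define it; old and new
    # weight uploads living side by side show up as duplicates.
    owners = defaultdict(list)
    for shard in sorted(Path(model_dir).glob("*.safetensors")):
        with safe_open(shard, framework="numpy") as f:
            for name in f.keys():
                owners[name].append(shard.name)
    for name, shards in owners.items():
        if len(shards) > 1:
            print(f"{name} appears in: {', '.join(shards)}")

find_duplicate_tensors("ICONN-1-Mini")  # directory name is an assumption
```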

Now it fails again like this, probably because llama.rope.dimension_count is null.

  File "/llmjob/llama.cpp/gguf-py/gguf/gguf_writer.py", line 1031, in _pack
    return struct.pack(f'{pack_prefix}{fmt}', value)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
struct.error: required argument is not an integer
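
For what it's worth, the converter typically derives llama.rope.dimension_count from the config, roughly head_dim or hidden_size // num_attention_heads. A hedged sanity check (the path is an assumption) would be:

```python
import json

with open("ICONN-1-Mini/config.json") as f:  # hypothetical local path
    cfg = json.load(f)

# Rough approximation of how the converter derives the rope dimension count;
# None here would reproduce the struct.error above.
head_dim = cfg.get("head_dim")
if head_dim is None:
    hidden = cfg.get("hidden_size")
    heads = cfg.get("num_attention_heads")
    head_dim = hidden // heads if hidden and heads else None

print("rope dimension count:", head_dim)
```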

ICONNForCausalLM is not supported by llama.cpp (this affects at least ICONN-1)

I checked and the others are luckily MixtralForCausalLM.

ICONNForCausalLM is the same as MixtralForCausalLM - I'll set it as that for the quants.
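
The change is just the architectures field in config.json; a minimal sketch (the path refers to a hypothetical local checkout of the repo):

```python
import json

path = "config.json"  # in the root of the model repository
with open(path) as f:
    cfg = json.load(f)

# Declare the model as Mixtral so llama.cpp's converter recognizes it.
cfg["architectures"] = ["MixtralForCausalLM"]

with open(path, "w") as f:
    json.dump(cfg, f, indent=2)
```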

Why would you change it to ICONNForCausalLM just to break compatibility with every existing tool if they are the same?

Thanks for changing it to MixtralForCausalLM. I told the task to update the local model to the latest version of your repository and retry quantizing it.
