replace with autogptq format

Signed-off-by: wenhuach <[email protected]>

Files changed (13)

README.md +0 -161
config.json +216 -350
generation_config.json +1 -1
model-00001-of-00007.safetensors +0 -3
model-00002-of-00007.safetensors +0 -3
model-00003-of-00007.safetensors +0 -3
model-00005-of-00007.safetensors +0 -3
model-00006-of-00007.safetensors +0 -3
model-00007-of-00007.safetensors +0 -3
model-00004-of-00007.safetensors → model.safetensors +2 -2
model.safetensors.index.json +0 -0
quantization_config.json +0 -360
quantize_config.json +229 -0

README.md DELETED

@@ -1,161 +0,0 @@
----
-license: apache-2.0
-datasets:
-- NeelNanda/pile-10k
----
-## Model Details
-This model is an int4 model with group_size 128 of [Qwen/Qwen2-57B-A14B-Instruct](https://huggingface.co/Qwen/Qwen2-57B-A14B-Instruct) generated by [intel/auto-round](https://github.com/intel/auto-round). auto-round is needed to run this model.
-## How To Use
-### INT4 CPU/CUDA Inference
-```python
-##git clone https://github.com/intel/auto-round.git
-##cd auto-round && pip install -vvv --no-build-isolation -e .
-from auto_round import AutoHfQuantizer ##must import
-import torch
-from transformers import AutoModelForCausalLM,AutoTokenizer
-quantized_model_dir = "Intel/Qwen2-57B-A14B-Instruct-int4-inc"
-tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir)
-model = AutoModelForCausalLM.from_pretrained(
-    quantized_model_dir,
-    torch_dtype=torch.float16,
-    device_map="auto",
-)
-prompt = "There is a girl who likes adventure,"
-messages = [
-    {"role": "system", "content": "You are a helpful assistant."},
-    {"role": "user", "content": prompt}
-]
-tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir)
-text = tokenizer.apply_chat_template(
-    messages,
-    tokenize=False,
-    add_generation_prompt=True
-)
-model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
-generated_ids = model.generate(
-    model_inputs.input_ids,
-    max_new_tokens=50,  ##change this to align with the official usage
-    do_sample=False  ##change this to align with the official usage
-)
-generated_ids = [
-output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
-]
-response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
-print(response)
-##prompt = "请介绍一下阿里巴巴公司"
-##阿里巴巴集团是一家中国跨国科技公司，成立于1999年，总部位于杭州。阿里巴巴的业务涵盖了电子商务、零售、金融、物流、云计算等多个领域，是全球最大的电子商务公司之一。\n 阿里巴巴旗下拥有淘宝网、天猫、
-##prompt = "9.8大还是9.11大"
-##9.8和9.11都是小数，但是9.8比9.11大。在数学中，小数的大小取决于它们的数值，数值越大则越“大”。在这个情况下，9.8的
-##prompt = "Once upon a time,"
-##there was a kingdom far, far away. In this kingdom, there lived a beautiful princess who had hair as golden as the sun and eyes as blue as the sea. The princess was kind and gentle, and everyone in the kingdom loved her dearly.
-##prompt = "There is a girl who likes adventure,"
-##That's great to hear! Adventure can be a wonderful way to explore new places, learn new things, and challenge yourself in exciting ways. If you're looking for ideas on how to embark on an adventure, here are a few suggestions: 1.
-```
-### Evaluate the model
-pip3 install lm-eval==0.4.2
-```bash
-git clone https://github.com/intel/auto-round
-cd auto-round/examples/language-modeling
-python3 eval_042/evluation.py --model_name "Intel/Qwen2-57B-A14B-Instruct-int4-inc" --eval_bs 16  --tasks lambada_openai,hellaswag,piqa,winogrande,truthfulqa_mc1,openbookqa,boolq,arc_easy,arc_challenge,mmlu,gsm8k,cmmlu,ceval-valid --trust_remote_code
-```
-| Metric                 | BF16   | INT4-AutoRound | [official GPTQ](https://huggingface.co/Qwen/Qwen2-57B-A14B-Instruct-GPTQ-Int4) |
-| :---------------------- | :------ | :-------------- | :------------------------------------------------------------ |
-| Avg                    | 0.7040 | 0.7043         | 0.6990                                                       |
-| mmlu                   | 0.7438 | 0.7408         | 0.7409                                                       |
-| cmmlu                  | 0.8505 | 0.8448         | 0.8475                                                       |
-| ceval-valid            | 0.8767 | 0.8611         | 0.8507                                                       |
-| gsm8k 5 shots (strict) | 0.7627 | 0.7657         | 0.7597                                                       |
-| lambada_openai         | 0.7452 | 0.7444         | 0.7524                                                       |
-| hellaswag              | 0.6517 | 0.6475         | 0.6471                                                       |
-| winogrande             | 0.7245 | 0.7285         | 0.7198                                                       |
-| piqa                   | 0.8058 | 0.8058         | 0.8041                                                       |
-| truthfulqa_mc1         | 0.4345 | 0.4321         | 0.4272                                                       |
-| openbookqa             | 0.3400 | 0.3560         | 0.3300                                                       |
-| boolq                  | 0.8835 | 0.8844         | 0.8810                                                       |
-| arc_easy               | 0.8035 | 0.8051         | 0.8001                                                       |
-| arc_challenge          | 0.5299 | 0.5392         | 0.5265                                                       |
-## Reproduce
-Here is the sample command to reproduce the model.
-```bash
-git clone https://github.com/intel/auto-round
-cd auto-round/examples/language-modeling
-pip install -r requirements.txt
-python3 main.py \
---model_name  Qwen/Qwen2-57B-A14B-Instruct \
---device 0 \
---group_size 128 \
---nsamples 512 \
---bits 4 \
---iter 1000 \
---disable_eval \
---fp_layers "shared_expert_gate,gate" \
---deployment_device 'auto_round' \
---output_dir "./tmp_autoround"
-```
-We found that the output of model.layers.3.mlp.shared_expert.down_proj can reach ~50k when the chat template is applied, which causes some backends such as exllamav2 to overflow. After quantizing the model, please manually add this to config.json:
-~~~bash
- "extra_config": {
-      "model.layers.3.mlp.shared_expert.down_proj": {
-      "clip": true
-      },
-  }
-~~~
-## Ethical Considerations and Limitations
-The model can produce factually incorrect output, and should not be relied on to produce factually accurate information. Because of the limitations of the pretrained model and the finetuning datasets, it is possible that this model could generate lewd, biased or otherwise offensive outputs.
-Therefore, before deploying any applications of the model, developers should perform safety testing.
-## Caveats and Recommendations
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.
-Here are a couple of useful links to learn more about Intel's AI software:
-* Intel Neural Compressor [link](https://github.com/intel/neural-compressor)
-* Intel Extension for Transformers [link](https://github.com/intel/intel-extension-for-transformers)
-## Disclaimer
-The license on this model does not constitute legal advice. We are not responsible for the actions of third parties who use this model. Please consult an attorney before using this model for commercial purposes.
-## Cite
-@article{cheng2023optimize, title={Optimize weight rounding via signed gradient descent for the quantization of llms}, author={Cheng, Wenhua and Zhang, Weiwei and Shen, Haihao and Cai, Yiyang and He, Xin and Lv, Kaokao and Liu, Yi}, journal={arXiv preprint arXiv:2309.05516}, year={2023} }
-[arxiv](https://arxiv.org/abs/2309.05516) [github](https://github.com/intel/auto-round)

config.json CHANGED

@@ -25,375 +25,241 @@
   "output_router_logits": false,
   "quantization_config": {
     "amp": true,
-    "autoround_version": "0.3.0.dev",
-    "backend": "auto_round:exllamav2",
     "bits": 4,
     "data_type": "int",
-    "dataset": "NeelNanda/pile-10k",
     "enable_minmax_tuning": true,
     "enable_quanted_input": true,
-    "extra_config": {
-      "model.layers.3.mlp.shared_expert.down_proj": {
-	            "clip": true
-		          },
-      "model.layers.0.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.0.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.1.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.1.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.10.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.10.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.11.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.11.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.12.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.12.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.13.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.13.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.14.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.14.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.15.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.15.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.16.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.16.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.17.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.17.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.18.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.18.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.19.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.19.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.2.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.2.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.20.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.20.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.21.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.21.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.22.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.22.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.23.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.23.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.24.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.24.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.25.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.25.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.26.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.26.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.27.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.27.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.3.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.3.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.4.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.4.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.5.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.5.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.6.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.6.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.7.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.7.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.8.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.8.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.9.mlp.gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      },
-      "model.layers.9.mlp.shared_expert_gate": {
-        "bits": 32,
-        "data_type": "bfloat",
-        "group_size": null,
-        "sym": null
-      }
-    },
     "gradient_accumulate_steps": 1,
     "group_size": 128,
     "iters": 1000,
     "low_gpu_mem_usage": false,
     "lr": 0.001,
     "minmax_lr": 0.001,
     "nsamples": 512,
-    "quant_method": "intel/auto-round",
     "scale_dtype": "torch.float16",
     "seqlen": 2048,
-    "sym": false,
-    "train_bs": 8
   },
   "rms_norm_eps": 1e-06,
   "rope_theta": 1000000.0,
   "router_aux_loss_coef": 0.001,
   "shared_expert_intermediate_size": 20480,
-  "sliding_window": 65536,
   "tie_word_embeddings": false,
-  "torch_dtype": "bfloat16",
-  "transformers_version": "4.41.1",
   "use_cache": true,
   "use_sliding_window": false,
   "vocab_size": 151936

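The removed `extra_config` block above follows a regular pattern: every `mlp.gate` and `mlp.shared_expert_gate` in layers 0 to 27 is kept at 32-bit `bfloat`, plus one `clip` entry for `model.layers.3.mlp.shared_expert.down_proj`. A minimal sketch that rebuilds that mapping is shown here; the helper name `build_extra_config` is hypothetical, and the layer count of 28 is read off the removed lines rather than taken from any library.

```python
# Hypothetical helper (not part of auto-round): rebuilds the per-layer
# overrides that the removed "extra_config" block spelled out by hand.
NUM_LAYERS = 28  # layers 0..27, as listed in the removed lines

# Each gate override in the removed block carried these four fields.
FP_OVERRIDE = {"bits": 32, "data_type": "bfloat", "group_size": None, "sym": None}


def build_extra_config(num_layers=NUM_LAYERS):
    # The one non-gate entry: clipping for the layer-3 shared expert down_proj.
    extra = {"model.layers.3.mlp.shared_expert.down_proj": {"clip": True}}
    for layer in range(num_layers):
        for name in ("gate", "shared_expert_gate"):
            extra[f"model.layers.{layer}.mlp.{name}"] = dict(FP_OVERRIDE)
    return extra


extra_config = build_extra_config()
print(len(extra_config), "entries")
```

The sketch yields 2 gate entries per layer plus the single `clip` entry, which matches the structure of the removed block.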
   "output_router_logits": false,
   "quantization_config": {
     "amp": true,
+    "autoround_version": "0.3.1.dev",
     "bits": 4,
+    "damp_percent": 0.01,
     "data_type": "int",
+    "desc_act": false,
     "enable_minmax_tuning": true,
+    "enable_norm_bias_tuning": false,
     "enable_quanted_input": true,
     "gradient_accumulate_steps": 1,
     "group_size": 128,
     "iters": 1000,
     "low_gpu_mem_usage": false,
     "lr": 0.001,
     "minmax_lr": 0.001,
+    "modules_in_block_to_quantize": [
+      [
+        "self_attn.q_proj",
+        "self_attn.k_proj",
+        "self_attn.v_proj",
+        "self_attn.o_proj",
+        "mlp.gate",
+        "mlp.experts.0.gate_proj",
+        "mlp.experts.0.up_proj",
+        "mlp.experts.0.down_proj",
+        "mlp.experts.1.gate_proj",
+        "mlp.experts.1.up_proj",
+        "mlp.experts.1.down_proj",
+        "mlp.experts.2.gate_proj",
+        "mlp.experts.2.up_proj",
+        "mlp.experts.2.down_proj",
+        "mlp.experts.3.gate_proj",
+        "mlp.experts.3.up_proj",
+        "mlp.experts.3.down_proj",
+        "mlp.experts.4.gate_proj",
+        "mlp.experts.4.up_proj",
+        "mlp.experts.4.down_proj",
+        "mlp.experts.5.gate_proj",
+        "mlp.experts.5.up_proj",
+        "mlp.experts.5.down_proj",
+        "mlp.experts.6.gate_proj",
+        "mlp.experts.6.up_proj",
+        "mlp.experts.6.down_proj",
+        "mlp.experts.7.gate_proj",
+        "mlp.experts.7.up_proj",
+        "mlp.experts.7.down_proj",
+        "mlp.experts.8.gate_proj",
+        "mlp.experts.8.up_proj",
+        "mlp.experts.8.down_proj",
+        "mlp.experts.9.gate_proj",
+        "mlp.experts.9.up_proj",
+        "mlp.experts.9.down_proj",
+        "mlp.experts.10.gate_proj",
+        "mlp.experts.10.up_proj",
+        "mlp.experts.10.down_proj",
+        "mlp.experts.11.gate_proj",
+        "mlp.experts.11.up_proj",
+        "mlp.experts.11.down_proj",
+        "mlp.experts.12.gate_proj",
+        "mlp.experts.12.up_proj",
+        "mlp.experts.12.down_proj",
+        "mlp.experts.13.gate_proj",
+        "mlp.experts.13.up_proj",
+        "mlp.experts.13.down_proj",
+        "mlp.experts.14.gate_proj",
+        "mlp.experts.14.up_proj",
+        "mlp.experts.14.down_proj",
+        "mlp.experts.15.gate_proj",
+        "mlp.experts.15.up_proj",
+        "mlp.experts.15.down_proj",
+        "mlp.experts.16.gate_proj",
+        "mlp.experts.16.up_proj",
+        "mlp.experts.16.down_proj",
+        "mlp.experts.17.gate_proj",
+        "mlp.experts.17.up_proj",
+        "mlp.experts.17.down_proj",
+        "mlp.experts.18.gate_proj",
+        "mlp.experts.18.up_proj",
+        "mlp.experts.18.down_proj",
+        "mlp.experts.19.gate_proj",
+        "mlp.experts.19.up_proj",
+        "mlp.experts.19.down_proj",
+        "mlp.experts.20.gate_proj",
+        "mlp.experts.20.up_proj",
+        "mlp.experts.20.down_proj",
+        "mlp.experts.21.gate_proj",
+        "mlp.experts.21.up_proj",
+        "mlp.experts.21.down_proj",
+        "mlp.experts.22.gate_proj",
+        "mlp.experts.22.up_proj",
+        "mlp.experts.22.down_proj",
+        "mlp.experts.23.gate_proj",
+        "mlp.experts.23.up_proj",
+        "mlp.experts.23.down_proj",
+        "mlp.experts.24.gate_proj",
+        "mlp.experts.24.up_proj",
+        "mlp.experts.24.down_proj",
+        "mlp.experts.25.gate_proj",
+        "mlp.experts.25.up_proj",
+        "mlp.experts.25.down_proj",
+        "mlp.experts.26.gate_proj",
+        "mlp.experts.26.up_proj",
+        "mlp.experts.26.down_proj",
+        "mlp.experts.27.gate_proj",
+        "mlp.experts.27.up_proj",
+        "mlp.experts.27.down_proj",
+        "mlp.experts.28.gate_proj",
+        "mlp.experts.28.up_proj",
+        "mlp.experts.28.down_proj",
+        "mlp.experts.29.gate_proj",
+        "mlp.experts.29.up_proj",
+        "mlp.experts.29.down_proj",
+        "mlp.experts.30.gate_proj",
+        "mlp.experts.30.up_proj",
+        "mlp.experts.30.down_proj",
+        "mlp.experts.31.gate_proj",
+        "mlp.experts.31.up_proj",
+        "mlp.experts.31.down_proj",
+        "mlp.experts.32.gate_proj",
+        "mlp.experts.32.up_proj",
+        "mlp.experts.32.down_proj",
+        "mlp.experts.33.gate_proj",
+        "mlp.experts.33.up_proj",
+        "mlp.experts.33.down_proj",
+        "mlp.experts.34.gate_proj",
+        "mlp.experts.34.up_proj",
+        "mlp.experts.34.down_proj",
+        "mlp.experts.35.gate_proj",
+        "mlp.experts.35.up_proj",
+        "mlp.experts.35.down_proj",
+        "mlp.experts.36.gate_proj",
+        "mlp.experts.36.up_proj",
+        "mlp.experts.36.down_proj",
+        "mlp.experts.37.gate_proj",
+        "mlp.experts.37.up_proj",
+        "mlp.experts.37.down_proj",
+        "mlp.experts.38.gate_proj",
+        "mlp.experts.38.up_proj",
+        "mlp.experts.38.down_proj",
+        "mlp.experts.39.gate_proj",
+        "mlp.experts.39.up_proj",
+        "mlp.experts.39.down_proj",
+        "mlp.experts.40.gate_proj",
+        "mlp.experts.40.up_proj",
+        "mlp.experts.40.down_proj",
+        "mlp.experts.41.gate_proj",
+        "mlp.experts.41.up_proj",
+        "mlp.experts.41.down_proj",
+        "mlp.experts.42.gate_proj",
+        "mlp.experts.42.up_proj",
+        "mlp.experts.42.down_proj",
+        "mlp.experts.43.gate_proj",
+        "mlp.experts.43.up_proj",
+        "mlp.experts.43.down_proj",
+        "mlp.experts.44.gate_proj",
+        "mlp.experts.44.up_proj",
+        "mlp.experts.44.down_proj",
+        "mlp.experts.45.gate_proj",
+        "mlp.experts.45.up_proj",
+        "mlp.experts.45.down_proj",
+        "mlp.experts.46.gate_proj",
+        "mlp.experts.46.up_proj",
+        "mlp.experts.46.down_proj",
+        "mlp.experts.47.gate_proj",
+        "mlp.experts.47.up_proj",
+        "mlp.experts.47.down_proj",
+        "mlp.experts.48.gate_proj",
+        "mlp.experts.48.up_proj",
+        "mlp.experts.48.down_proj",
+        "mlp.experts.49.gate_proj",
+        "mlp.experts.49.up_proj",
+        "mlp.experts.49.down_proj",
+        "mlp.experts.50.gate_proj",
+        "mlp.experts.50.up_proj",
+        "mlp.experts.50.down_proj",
+        "mlp.experts.51.gate_proj",
+        "mlp.experts.51.up_proj",
+        "mlp.experts.51.down_proj",
+        "mlp.experts.52.gate_proj",
+        "mlp.experts.52.up_proj",
+        "mlp.experts.52.down_proj",
+        "mlp.experts.53.gate_proj",
+        "mlp.experts.53.up_proj",
+        "mlp.experts.53.down_proj",
+        "mlp.experts.54.gate_proj",
+        "mlp.experts.54.up_proj",
+        "mlp.experts.54.down_proj",
+        "mlp.experts.55.gate_proj",
+        "mlp.experts.55.up_proj",
+        "mlp.experts.55.down_proj",
+        "mlp.experts.56.gate_proj",
+        "mlp.experts.56.up_proj",
+        "mlp.experts.56.down_proj",
+        "mlp.experts.57.gate_proj",
+        "mlp.experts.57.up_proj",
+        "mlp.experts.57.down_proj",
+        "mlp.experts.58.gate_proj",
+        "mlp.experts.58.up_proj",
+        "mlp.experts.58.down_proj",
+        "mlp.experts.59.gate_proj",
+        "mlp.experts.59.up_proj",
+        "mlp.experts.59.down_proj",
+        "mlp.experts.60.gate_proj",
+        "mlp.experts.60.up_proj",
+        "mlp.experts.60.down_proj",
+        "mlp.experts.61.gate_proj",
+        "mlp.experts.61.up_proj",
+        "mlp.experts.61.down_proj",
+        "mlp.experts.62.gate_proj",
+        "mlp.experts.62.up_proj",
+        "mlp.experts.62.down_proj",
+        "mlp.experts.63.gate_proj",
+        "mlp.experts.63.up_proj",
+        "mlp.experts.63.down_proj",
+        "mlp.shared_expert.gate_proj",
+        "mlp.shared_expert.up_proj",
+        "mlp.shared_expert.down_proj"
+      ]
+    ],
     "nsamples": 512,
+    "quant_block_list": null,
+    "quant_method": "gptq",
     "scale_dtype": "torch.float16",
     "seqlen": 2048,
+    "sym": true,
+    "train_bs": 8,
+    "true_sequential": false
   },
   "rms_norm_eps": 1e-06,
   "rope_theta": 1000000.0,
   "router_aux_loss_coef": 0.001,
   "shared_expert_intermediate_size": 20480,
+  "sliding_window": null,
   "tie_word_embeddings": false,
+  "torch_dtype": "float16",
+  "transformers_version": "4.44.2",
   "use_cache": true,
   "use_sliding_window": false,
   "vocab_size": 151936

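The added `modules_in_block_to_quantize` list is also regular: four attention projections, `mlp.gate`, three projections for each of experts 0 to 63, and three projections for the shared expert. A minimal sketch that regenerates the same nested list follows; the function name `build_modules_in_block` is an assumption made for illustration, and the expert count of 64 is read off the added lines.

```python
# Hypothetical helper: regenerates the "modules_in_block_to_quantize" value
# shown in the added lines, so the list can be checked without reading
# roughly 200 lines of JSON by eye.
NUM_EXPERTS = 64  # experts 0..63, as listed in the added lines


def build_modules_in_block(num_experts=NUM_EXPERTS):
    projections = ("gate_proj", "up_proj", "down_proj")
    modules = [f"self_attn.{p}" for p in ("q_proj", "k_proj", "v_proj", "o_proj")]
    modules.append("mlp.gate")
    for expert in range(num_experts):
        modules.extend(f"mlp.experts.{expert}.{p}" for p in projections)
    modules.extend(f"mlp.shared_expert.{p}" for p in projections)
    # The config stores a list containing one inner list of module names.
    return [modules]


modules_in_block = build_modules_in_block()
print(len(modules_in_block[0]), "module names")
```

Note that the list in this diff contains `mlp.gate` but not `mlp.shared_expert_gate`, and the sketch mirrors that.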
generation_config.json CHANGED

@@ -10,5 +10,5 @@
   "temperature": 0.7,
   "top_k": 20,
   "top_p": 0.8,
-  "transformers_version": "4.41.1"
 }

   "temperature": 0.7,
   "top_k": 20,
   "top_p": 0.8,
+  "transformers_version": "4.44.2"
 }

model-00001-of-00007.safetensors DELETED

@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:167fc297862ecd2e0033de4c717668a83a6904b31c14025c6d96298f08706e4b
-size 4999673648

model-00002-of-00007.safetensors DELETED

@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:2e484a47e41edba04660cf6862d0ee5c2cb1d8614968c35b5ec7aa1cbe7681b3
-size 4996756752

model-00003-of-00007.safetensors DELETED

@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:77db3c5e4a435e4f834011974388b3f8d177e83537c5acd5764d06db3c592462
-size 4996759400

model-00005-of-00007.safetensors DELETED

@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:c825fa9b74f5e7f55362b085315c8db3c82e4679e0faae23c07eb1ffff8eed3c
-size 5000290584

model-00006-of-00007.safetensors DELETED

@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:79105d05c0ea1507176ced998d5598cb3b4f4a7678abea09d87cd87e682120b6
-size 4996760576

model-00007-of-00007.safetensors DELETED

@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:c1d6e7a53c719cb3c8938fbaad0e638d1f5149a40bcc3103539004ef994eb072
-size 1538228952

model-00004-of-00007.safetensors → model.safetensors RENAMED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:74f499b3d5cc607ee82042f2de2bc4ace38bd51e0b947b702ac52b39e91563b8
-size 4996760664

 version https://git-lfs.github.com/spec/v1
+oid sha256:d15f21e20a022b32b6099710f772e88b8586b50af88871fb44e2422664109fa2
+size 31475221720

model.safetensors.index.json DELETED

The diff for this file is too large to render. See raw diff

quantization_config.json DELETED

@@ -1,360 +0,0 @@
-{
-  "bits": 4,
-  "group_size": 128,
-  "sym": false,
-  "data_type": "int",
-  "enable_quanted_input": true,
-  "enable_minmax_tuning": true,
-  "seqlen": 2048,
-  "train_bs": 8,
-  "scale_dtype": "torch.float16",
-  "lr": 0.001,
-  "minmax_lr": 0.001,
-  "gradient_accumulate_steps": 1,
-  "iters": 1000,
-  "amp": true,
-  "nsamples": 512,
-  "low_gpu_mem_usage": false,
-  "dataset": "NeelNanda/pile-10k",
-  "autoround_version": "0.3.0.dev",
-  "quant_method": "intel/auto-round",
-  "backend": "auto_round:exllamav2",
-  "extra_config": {
-    "model.layers.0.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.1.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.2.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.3.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.4.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.5.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.6.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.7.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.8.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.9.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.10.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.11.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.12.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.13.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.14.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.15.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.16.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.17.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.18.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.19.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.20.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.21.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.22.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.23.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.24.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.25.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.26.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.27.mlp.shared_expert_gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.0.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.1.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.2.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.3.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.4.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.5.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.6.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.7.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.8.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.9.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.10.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.11.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.12.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.13.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.14.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.15.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.16.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.17.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.18.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.19.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.20.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.21.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.22.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.23.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.24.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.25.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.26.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    },
-    "model.layers.27.mlp.gate": {
-      "data_type": "bfloat",
-      "bits": 32,
-      "group_size": null,
-      "sym": null
-    }
-  }
-}

quantize_config.json ADDED

@@ -0,0 +1,229 @@

+{
+  "bits": 4,
+  "group_size": 128,
+  "sym": true,
+  "data_type": "int",
+  "enable_quanted_input": true,
+  "enable_minmax_tuning": true,
+  "seqlen": 2048,
+  "train_bs": 8,
+  "scale_dtype": "torch.float16",
+  "lr": 0.001,
+  "minmax_lr": 0.001,
+  "gradient_accumulate_steps": 1,
+  "iters": 1000,
+  "amp": true,
+  "nsamples": 512,
+  "low_gpu_mem_usage": false,
+  "quant_block_list": null,
+  "enable_norm_bias_tuning": false,
+  "autoround_version": "0.3.1.dev",
+  "quant_method": "gptq",
+  "desc_act": false,
+  "true_sequential": false,
+  "damp_percent": 0.01,
+  "modules_in_block_to_quantize": [
+    [
+      "self_attn.q_proj",
+      "self_attn.k_proj",
+      "self_attn.v_proj",
+      "self_attn.o_proj",
+      "mlp.gate",
+      "mlp.experts.0.gate_proj",
+      "mlp.experts.0.up_proj",
+      "mlp.experts.0.down_proj",
+      "mlp.experts.1.gate_proj",
+      "mlp.experts.1.up_proj",
+      "mlp.experts.1.down_proj",
+      "mlp.experts.2.gate_proj",
+      "mlp.experts.2.up_proj",
+      "mlp.experts.2.down_proj",
+      "mlp.experts.3.gate_proj",
+      "mlp.experts.3.up_proj",
+      "mlp.experts.3.down_proj",
+      "mlp.experts.4.gate_proj",
+      "mlp.experts.4.up_proj",
+      "mlp.experts.4.down_proj",
+      "mlp.experts.5.gate_proj",
+      "mlp.experts.5.up_proj",
+      "mlp.experts.5.down_proj",
+      "mlp.experts.6.gate_proj",
+      "mlp.experts.6.up_proj",
+      "mlp.experts.6.down_proj",
+      "mlp.experts.7.gate_proj",
+      "mlp.experts.7.up_proj",
+      "mlp.experts.7.down_proj",
+      "mlp.experts.8.gate_proj",
+      "mlp.experts.8.up_proj",
+      "mlp.experts.8.down_proj",
+      "mlp.experts.9.gate_proj",
+      "mlp.experts.9.up_proj",
+      "mlp.experts.9.down_proj",
+      "mlp.experts.10.gate_proj",
+      "mlp.experts.10.up_proj",
+      "mlp.experts.10.down_proj",
+      "mlp.experts.11.gate_proj",
+      "mlp.experts.11.up_proj",
+      "mlp.experts.11.down_proj",
+      "mlp.experts.12.gate_proj",
+      "mlp.experts.12.up_proj",
+      "mlp.experts.12.down_proj",
+      "mlp.experts.13.gate_proj",
+      "mlp.experts.13.up_proj",
+      "mlp.experts.13.down_proj",
+      "mlp.experts.14.gate_proj",
+      "mlp.experts.14.up_proj",
+      "mlp.experts.14.down_proj",
+      "mlp.experts.15.gate_proj",
+      "mlp.experts.15.up_proj",
+      "mlp.experts.15.down_proj",
+      "mlp.experts.16.gate_proj",
+      "mlp.experts.16.up_proj",
+      "mlp.experts.16.down_proj",
+      "mlp.experts.17.gate_proj",
+      "mlp.experts.17.up_proj",
+      "mlp.experts.17.down_proj",
+      "mlp.experts.18.gate_proj",
+      "mlp.experts.18.up_proj",
+      "mlp.experts.18.down_proj",
+      "mlp.experts.19.gate_proj",
+      "mlp.experts.19.up_proj",
+      "mlp.experts.19.down_proj",
+      "mlp.experts.20.gate_proj",
+      "mlp.experts.20.up_proj",
+      "mlp.experts.20.down_proj",
+      "mlp.experts.21.gate_proj",
+      "mlp.experts.21.up_proj",
+      "mlp.experts.21.down_proj",
+      "mlp.experts.22.gate_proj",
+      "mlp.experts.22.up_proj",
+      "mlp.experts.22.down_proj",
+      "mlp.experts.23.gate_proj",
+      "mlp.experts.23.up_proj",
+      "mlp.experts.23.down_proj",
+      "mlp.experts.24.gate_proj",
+      "mlp.experts.24.up_proj",
+      "mlp.experts.24.down_proj",
+      "mlp.experts.25.gate_proj",
+      "mlp.experts.25.up_proj",
+      "mlp.experts.25.down_proj",
+      "mlp.experts.26.gate_proj",
+      "mlp.experts.26.up_proj",
+      "mlp.experts.26.down_proj",
+      "mlp.experts.27.gate_proj",
+      "mlp.experts.27.up_proj",
+      "mlp.experts.27.down_proj",
+      "mlp.experts.28.gate_proj",
+      "mlp.experts.28.up_proj",
+      "mlp.experts.28.down_proj",
+      "mlp.experts.29.gate_proj",
+      "mlp.experts.29.up_proj",
+      "mlp.experts.29.down_proj",
+      "mlp.experts.30.gate_proj",
+      "mlp.experts.30.up_proj",
+      "mlp.experts.30.down_proj",
+      "mlp.experts.31.gate_proj",
+      "mlp.experts.31.up_proj",
+      "mlp.experts.31.down_proj",
+      "mlp.experts.32.gate_proj",
+      "mlp.experts.32.up_proj",
+      "mlp.experts.32.down_proj",
+      "mlp.experts.33.gate_proj",
+      "mlp.experts.33.up_proj",
+      "mlp.experts.33.down_proj",
+      "mlp.experts.34.gate_proj",
+      "mlp.experts.34.up_proj",
+      "mlp.experts.34.down_proj",
+      "mlp.experts.35.gate_proj",
+      "mlp.experts.35.up_proj",
+      "mlp.experts.35.down_proj",
+      "mlp.experts.36.gate_proj",
+      "mlp.experts.36.up_proj",
+      "mlp.experts.36.down_proj",
+      "mlp.experts.37.gate_proj",
+      "mlp.experts.37.up_proj",
+      "mlp.experts.37.down_proj",
+      "mlp.experts.38.gate_proj",
+      "mlp.experts.38.up_proj",
+      "mlp.experts.38.down_proj",
+      "mlp.experts.39.gate_proj",
+      "mlp.experts.39.up_proj",
+      "mlp.experts.39.down_proj",
+      "mlp.experts.40.gate_proj",
+      "mlp.experts.40.up_proj",
+      "mlp.experts.40.down_proj",
+      "mlp.experts.41.gate_proj",
+      "mlp.experts.41.up_proj",
+      "mlp.experts.41.down_proj",
+      "mlp.experts.42.gate_proj",
+      "mlp.experts.42.up_proj",
+      "mlp.experts.42.down_proj",
+      "mlp.experts.43.gate_proj",
+      "mlp.experts.43.up_proj",
+      "mlp.experts.43.down_proj",
+      "mlp.experts.44.gate_proj",
+      "mlp.experts.44.up_proj",
+      "mlp.experts.44.down_proj",
+      "mlp.experts.45.gate_proj",
+      "mlp.experts.45.up_proj",
+      "mlp.experts.45.down_proj",
+      "mlp.experts.46.gate_proj",
+      "mlp.experts.46.up_proj",
+      "mlp.experts.46.down_proj",
+      "mlp.experts.47.gate_proj",
+      "mlp.experts.47.up_proj",
+      "mlp.experts.47.down_proj",
+      "mlp.experts.48.gate_proj",
+      "mlp.experts.48.up_proj",
+      "mlp.experts.48.down_proj",
+      "mlp.experts.49.gate_proj",
+      "mlp.experts.49.up_proj",
+      "mlp.experts.49.down_proj",
+      "mlp.experts.50.gate_proj",
+      "mlp.experts.50.up_proj",
+      "mlp.experts.50.down_proj",
+      "mlp.experts.51.gate_proj",
+      "mlp.experts.51.up_proj",
+      "mlp.experts.51.down_proj",
+      "mlp.experts.52.gate_proj",
+      "mlp.experts.52.up_proj",
+      "mlp.experts.52.down_proj",
+      "mlp.experts.53.gate_proj",
+      "mlp.experts.53.up_proj",
+      "mlp.experts.53.down_proj",
+      "mlp.experts.54.gate_proj",
+      "mlp.experts.54.up_proj",
+      "mlp.experts.54.down_proj",
+      "mlp.experts.55.gate_proj",
+      "mlp.experts.55.up_proj",
+      "mlp.experts.55.down_proj",
+      "mlp.experts.56.gate_proj",
+      "mlp.experts.56.up_proj",
+      "mlp.experts.56.down_proj",
+      "mlp.experts.57.gate_proj",
+      "mlp.experts.57.up_proj",
+      "mlp.experts.57.down_proj",
+      "mlp.experts.58.gate_proj",
+      "mlp.experts.58.up_proj",
+      "mlp.experts.58.down_proj",
+      "mlp.experts.59.gate_proj",
+      "mlp.experts.59.up_proj",
+      "mlp.experts.59.down_proj",
+      "mlp.experts.60.gate_proj",
+      "mlp.experts.60.up_proj",
+      "mlp.experts.60.down_proj",
+      "mlp.experts.61.gate_proj",
+      "mlp.experts.61.up_proj",
+      "mlp.experts.61.down_proj",
+      "mlp.experts.62.gate_proj",
+      "mlp.experts.62.up_proj",
+      "mlp.experts.62.down_proj",
+      "mlp.experts.63.gate_proj",
+      "mlp.experts.63.up_proj",
+      "mlp.experts.63.down_proj",
+      "mlp.shared_expert.gate_proj",
+      "mlp.shared_expert.up_proj",
+      "mlp.shared_expert.down_proj"
+    ]
+  ]
+}
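
The `modules_in_block_to_quantize` entry added above follows a regular pattern: four attention projections, the `mlp.gate` router, three projections for each of the 64 experts (indices 0 to 63), and three projections for the shared expert. A minimal Python sketch of how such a list could be regenerated for inspection or comparison is shown below. The helper name `build_modules_in_block` and its `num_experts` parameter are hypothetical and are not part of auto-round or AutoGPTQ; the module names are copied from the config above.

```python
import json


def build_modules_in_block(num_experts: int = 64) -> list[list[str]]:
    """Rebuild the per-block module list in the order used by the config above.

    Hypothetical helper for illustration only; not an auto-round or AutoGPTQ API.
    """
    projections = ("gate_proj", "up_proj", "down_proj")

    # Attention projections come first.
    modules = [f"self_attn.{name}" for name in ("q_proj", "k_proj", "v_proj", "o_proj")]

    # The MoE router entry is listed before the experts.
    modules.append("mlp.gate")

    # Three projections per routed expert, in index order.
    for index in range(num_experts):
        modules.extend(f"mlp.experts.{index}.{proj}" for proj in projections)

    # The shared expert closes the list.
    modules.extend(f"mlp.shared_expert.{proj}" for proj in projections)

    # The config stores a list of lists, with a single inner list here.
    return [modules]


if __name__ == "__main__":
    block = build_modules_in_block()
    print(len(block[0]))
    print(json.dumps(block[0][:6], indent=2))
```

Comparing the result with the `modules_in_block_to_quantize` value loaded from `quantize_config.json` via `json.load` is one way to check that no expert entry was dropped or duplicated.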