GGUF

Model Details

This model is a mixed gguf:q2ks quantization of Qwen/Qwen3-30B-A3B generated by the intel/auto-round algorithm. The embedding layer and non-expert layers fall back to 4 bits. Please refer to the section "Generate the model" for more details.

How To Use

llama.cpp Inference

./llama-cli -hf Intel/Qwen3-30B-A3B-q2ks-mixed-AutoRound-inc-v0 --conversation
"""
> Hi
<think>
Okay, the user just said "Hi". I need to respond appropriately. Since it's a greeting, I should greet them back and offer assistance. Let me make sure the response is friendly and open-ended. Maybe something like "Hello! How can I assist you today?" That should work. I don't need to add anything else unless they ask for more. Keep it simple and polite.
</think>

Hello! How can I assist you today? 😊

> Which is bigger, 9.8 or 9.11?
<think>
Okay, the user is asking whether 9.8 is bigger than 9.11. Let me think about how to approach this.

First, I need to confirm the numbers. 9.8 and 9.11. Wait, are these decimal numbers? So 9.8 is the same as 9.80, right? And 9.11 is 9.11. Comparing them, 9.80 is larger than 9.11 because 0.80 is bigger than 0.11. So 9.8 is bigger. But maybe the user is confused about decimal points. Let me check if there's any other way to interpret these numbers. Like, maybe they're talking about dates? But 9.8 and 9.11 could be September 8th and September 11th. But the question is about which is "big" – maybe the user is asking which date is later? September 11th is later than September 8th, so 9.11 would be bigger in that context. But the original question is about numerical values.

Wait, the user wrote "9.8大还是9.11大". "大" means big, so comparing the numbers. If it's about numerical values, 9.8 is larger than 9.11. But if it's about dates, September 11th is later. However, the user didn't specify context. Maybe they are asking about numbers. So I should explain both possibilities.

But in math terms, 9.8 is greater than 9.11. Because when comparing decimals, 9.8 is 9.80, which is larger than 9.11. So the answer would be 9.8 is bigger. But I should make sure to clarify that if they meant dates, then 9.11 would be later. However, since the question is in Chinese, maybe the user is asking about numbers.

I should respond by stating that 9.8 is larger than 9.11 when comparing numerical values. But also mention if they're referring to dates, then 9.11 (September 11th) is later. But since the original question is about "大", which is about size, it's likely about numbers. So the answer is 9.8 is bigger.

Wait, but in Chinese, sometimes people might write 9.8 as 9.8 or 9.11 as 9.11. So the comparison is straightforward. So the answer is 9.8 is larger.

But I need to make sure the user understands that. Maybe they are confused about decimal places. Let me explain step by step.

First, compare the whole number parts: both are 9. Then compare the decimal parts. 0.8 vs. 0.11. Since 0.8 is 0.80, which is larger than 0.11. So 9.8 is bigger.

Alternatively, if they are dates, September 11th is later than September 8th. But the user didn't specify. So I should cover both possibilities.

But the user's question is in Chinese, so maybe they are asking about numbers. So I'll explain that 9.8 is larger than 9.11 when comparing numerical values.

I should also check if there's any cultural context. For example, in some contexts, numbers might be written differently. But I think the standard decimal comparison applies here.

So the final answer is 9.8 is larger than 9.11. But I should present it clearly, maybe with an example.
</think>
When comparing numerical values, the relationship between **9.8** and **9.11** is as follows:

- **9.8** is **9.80** (i.e., 9.8), while **9.11** is **9.11**.
- Comparing the decimal parts, **0.80** (from 9.8) is greater than **0.11** (from 9.11).
- Therefore, **9.8 > 9.11**.

If the question refers to dates (e.g., **September 8** vs. **September 11**), then **September 11** (9.11) is the later date, but in terms of numerical value, **9.8** is larger.

In summary: **9.8 is greater than 9.11**.

> There are ten birds in a tree. One is shot with a gun. How many are left?
<think>
Okay, let's see. The user is asking a riddle: "There are ten birds on a tree. A gun is used to shoot one, how many are left?" Hmm, this seems like a classic riddle that's not straightforward. Let me think about it.

First, the initial number is ten birds. If you shoot one, the obvious answer would be 10 - 1 = 9. But riddles often have a twist. Maybe the answer is different. Let me recall similar riddles. Oh, right! The trick here is that when you shoot a bird, the others might fly away because of the noise. So instead of 9, the answer could be zero because all the remaining birds would be scared and leave.

But wait, the user might be expecting that. However, sometimes the answer depends on the context. If the birds are dead, then 9 would be the answer. But if they are alive and the gun shot causes them to fly away, then none would be left. But the riddle is often designed to have the answer as zero. Let me check if that's the case.

Another angle: the question is in Chinese, so maybe there's a specific cultural reference. But I think the common answer is zero. However, I should explain both possibilities. The user might be testing if they know the trick. So the answer is either 9 or 0, depending on whether the other birds fly away.

But I need to make sure I explain the reasoning. So, the answer is zero because the sound of the gunshot would scare the remaining birds, causing them to fly away. Therefore, no birds are left. But if the question is literal, then 9. But the riddle is designed to trick you into thinking it's 9, but the actual answer is zero.

So I should present both possibilities and explain the trick.
</think>

This is a classic brain teaser, and the answer depends on how you interpret it:

1. **Literal arithmetic**:
   If you simply subtract the bird that was shot, the answer is **9** (10 - 1 = 9).

2. **The logical twist**:
   When the gun fires, the remaining birds are startled and fly away, so **0 birds are left**.

**The usual answer is 0**, because the question is meant to test lateral thinking rather than arithmetic. A similar riddle: "There are 3 birds in a tree; one is shot; how many are left?" The answer could be "0" (the other birds fly away) or "2" (if the birds cannot fly). By the usual convention, the answer is **0**.

That said, if the question is taken as plain arithmetic, the answer is **9**.

> how many e in Qwen
<think>
Okay, the user is asking "how many e in Qwen". Let me check the spelling of Qwen. Q-w-e-n. So the letters are Q, W, E, N. So there is one E. But maybe they're asking about the number of E's in the name "Qwen". Let me confirm. The name Qwen is spelled Q-W-E-N. So there's one E. But maybe the user is confused or there's a typo. Alternatively, if they meant "Qwen" as in the model name, but the model is called Qwen, so the letters are Q, W, E, N. So one E. But maybe they're asking about the number of E's in the word "Qwen" when written in lowercase or uppercase. But regardless, it's just one E. So the answer is 1. But I need to make sure there's no other E's. Let me double-check. Q-W-E-N. Yes, only one E. So the answer is 1. But maybe the user is tricking me? Or maybe they meant something else. But I think the answer is 1.
</think>

The word **"Qwen"** contains the letter **"e"** **1 time**.

Spelled out: **Q-W-E-N**, with exactly one **E**.

If the question has another implied meaning (for example a different word or a specific context), let me know and I will adjust my answer! 😊

"""

Generate the model

This PR needs to be merged first: https://github.com/intel/auto-round/pull/628

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "Qwen/Qwen3-30B-A3B"
model = AutoModelForCausalLM.from_pretrained(model_name,
                                             device_map="cpu", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Fall back the embedding layer and all non-expert linear layers to 4 bits;
# the expert layers keep the default 2-bit q2_k_s scheme.
layer_config = {}
for n, m in model.named_modules():
    if isinstance(m, torch.nn.Embedding):
        layer_config[n] = {"bits": 4}
    elif isinstance(m, torch.nn.Linear) and "expert" not in n and n != "lm_head":
        layer_config[n] = {"bits": 4}

autoround = AutoRound(model, tokenizer, iters=0, layer_config=layer_config)
autoround.quantize_and_save("./tmp_autoround", format="gguf:q2_k_s")
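The name-based fallback rule in the script above can be sanity-checked in isolation. The sketch below reimplements the same selection logic as a pure function over (layer name, layer kind) pairs; the example layer names are illustrative placeholders loosely modeled on Qwen3-MoE's module layout, not the exact names in the checkpoint:

```python
def fallback_bits(name, kind):
    # Mirror the rule used above: embeddings and non-expert Linear layers
    # (except lm_head) fall back to 4 bits; everything else keeps the
    # default 2-bit q2_k_s scheme (signaled here by returning None).
    if kind == "embedding":
        return 4
    if kind == "linear" and "expert" not in name and name != "lm_head":
        return 4
    return None

# Illustrative layer names (placeholders, not the real checkpoint names).
layers = [
    ("model.embed_tokens", "embedding"),
    ("model.layers.0.self_attn.q_proj", "linear"),
    ("model.layers.0.mlp.experts.0.gate_proj", "linear"),
    ("lm_head", "linear"),
]

layer_config = {n: {"bits": fallback_bits(n, k)}
                for n, k in layers if fallback_bits(n, k) is not None}
print(sorted(layer_config))  # only the embedding and the attention projection
```

Only the embedding and the attention projection receive the 4-bit override; the expert layer and lm_head are left to the default scheme, matching the intent of the generation script.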

Ethical Considerations and Limitations

The model can produce factually incorrect output and should not be relied on to produce factually accurate information. Because of the limitations of the pretrained model and the finetuning datasets, this model could generate lewd, biased, or otherwise offensive outputs.

Therefore, before deploying any applications of the model, developers should perform safety testing.

Caveats and Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.

Here is a useful link to learn more about Intel's AI software:

  • Intel Neural Compressor

Disclaimer

The license on this model does not constitute legal advice. We are not responsible for the actions of third parties who use this model. Please consult an attorney before using this model for commercial purposes.

Cite

@article{cheng2023optimize,
  title={Optimize weight rounding via signed gradient descent for the quantization of llms},
  author={Cheng, Wenhua and Zhang, Weiwei and Shen, Haihao and Cai, Yiyang and He, Xin and Lv, Kaokao and Liu, Yi},
  journal={arXiv preprint arXiv:2309.05516},
  year={2023}
}


Model size: 30.5B params
Architecture: qwen3moe