Model Details

This GGUF model is an INT4 quantization of PowerInfer/SmallThinker-3B-Preview, using group_size 32 and symmetric quantization, generated by intel/auto-round.
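To illustrate what "group_size 32" and "symmetric" mean, here is a minimal round-to-nearest sketch in plain Python. It is only an assumption-laden illustration: it is neither the signed-gradient-descent rounding that auto-round performs nor the exact Q4_0 block layout used by llama.cpp, and the function names and the toy `weights` list are hypothetical.

```python
def quantize_group_sym_int4(values):
    """Quantize one group of weights to signed 4-bit codes with a shared scale."""
    max_abs = max(abs(v) for v in values)
    # Symmetric: a single scale per group and no zero-point offset.
    scale = max_abs / 7 if max_abs > 0 else 1.0
    # Plain round-to-nearest, clamped to the signed 4-bit range [-8, 7].
    codes = [max(-8, min(7, round(v / scale))) for v in values]
    return codes, scale


def dequantize_group(codes, scale):
    """Map 4-bit codes back to approximate floating-point weights."""
    return [c * scale for c in codes]


def quantize_weights(weights, group_size=32):
    """Split a flat weight list into groups and quantize each one independently."""
    return [
        quantize_group_sym_int4(weights[start:start + group_size])
        for start in range(0, len(weights), group_size)
    ]


# Toy data (hypothetical): 64 pseudo-random weights in [-0.5, 0.5].
weights = [((i * 37) % 101 - 50) / 100 for i in range(64)]
groups = quantize_weights(weights, group_size=32)
restored = [w for codes, scale in groups for w in dequantize_group(codes, scale)]
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(f"groups={len(groups)} max_error={max_error:.4f}")
```

Each group of 32 weights stores its own scale, so a smaller group size tracks local weight ranges more closely at the cost of storing more scales.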

How To Use

Requirements

Please follow the llama.cpp "Build llama.cpp locally" guide to install the necessary dependencies.

INT4 Inference

# Prompt (Chinese): "Why haven't penguins been eaten by polar bears?"
text="为什么企鹅没有被北极熊吃掉?"
./llama-cli -m SmallThinker-3B-Preview-3.1B-Q4_0.gguf -p "Human:You are a helpful and harmless assistant.You should think step-by-step.$text\nAssistant:\n" -n 512

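# INT4 (translated from Chinese):
# I am trying to understand why penguins have not been eaten by polar bears. This is an interesting question; let me think about it.
# First, I need to understand the habitats of penguins and polar bears. Penguins are mainly found in the Southern Hemisphere, especially in Antarctica and the western part of South America. Polar bears mainly live in the Arctic, including the northern parts of Canada, Russia, Norway, and the United States.
# Because penguins and polar bears live in different habitats, they rarely come into direct contact. However, if their habitats overlapped, penguins could face the risk of being preyed on by polar bears.
# Let me think about this. The probability that the habitats of penguins and polar bears overlap is probably very low. Even if penguins and polar bears might meet in some cases, penguins are usually faster and more agile than polar bears, and their ability to fly may also be a major advantage that makes them hard for polar bears to catch.
# In addition, polar bears mainly prey on marine animals such as seals and sea lions, whereas penguins are land animals, so the food-chain relationship between them may not be direct. Therefore, penguins may not become food for polar bears.
# Let me summarize my reasoning. Penguins and polar bears live in different habitats, and the probability of direct contact is low. Penguins are faster and more agile than polar bears, and their ability to fly makes them hard to catch. In addition, the food-chain relationship between penguins and polar bears is not direct, so penguins will not become food for polar bears.
# Therefore, I think the reasons penguins have not been eaten by polar bears are that their habitats differ, the probability of contact is low, and penguins' ability to fly makes them hard to catch. The indirect food-chain relationship between penguins and polar bears is also an important reason why penguins do not become food for polar bears.
# Final answer: Penguins have not been eaten by polar bears because their habitats differ, the probability of contact is low, penguins' ability to fly makes them hard to catch, and the food-chain relationship between penguins and polar bears is not direct.

# BF16 (translated from Chinese):
# I am trying to understand why penguins are not eaten by polar bears. This is an interesting question that involves ecological relationships and survival strategies among different species.
# First, I need to understand the habitats of penguins and polar bears. Penguins are mainly found in the Southern Hemisphere, especially in Antarctica and the surrounding regions, which have abundant marine resources. Polar bears live on the ice- and snow-covered islands and coastlines around the Arctic Ocean and feed on seals and other marine animals.
# Next, I should consider any direct competition or predation between the two species. Because they live in the Southern and Northern Hemispheres respectively, their ranges do not overlap, so in reality there is no direct predatory relationship between penguins and polar bears. This means that, from the perspective of ecological niches, they are not competitors and have no direct food-chain link.
# In addition, I should think about the effect of climate change on these ecosystems. As the global climate warms and polar glaciers melt, polar bears' food sources may decrease, which could even affect their numbers. However, this does not mean that polar bears would attack penguins, because their survival needs are different.
# Furthermore, I need to consider the concepts of biodiversity and ecological balance. In a healthy ecosystem, every species has its own significance and value, and together they maintain the stability of the ecosystem. If polar bears preyed on penguins in large numbers, it could upset the ecological balance of the region, but that is not what actually happens.
# I am also wondering whether other factors limit the possibility of polar bears preying on penguins. For example, geographic isolation, differences in behavior, or differences in physiological adaptation could all be barriers.
# In summary, I think the main reasons penguins are not eaten by polar bears are the separation of their ecological niches and the absence of a predatory relationship between them. Despite factors such as climate change, for now there is no direct predation threat between penguins and polar bears.
# **Final answer**
# \[ \boxed{\text{The ecological niches of penguins and polar bears are separate, and there is no direct predatory relationship, so there is no predation threat between them.}} \]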

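# Prompt (Chinese): "There are ten birds on a branch. If you shoot one, how many are left? Please answer in Chinese."
text="树枝上有十只鸟,如果你射杀了一只,还剩下几只?请用中文回答"
# INT4 (translated from Chinese):
# I am trying to solve this problem. So, I need to know how many birds are left if there are ten birds on a branch and one is shot. The problem looks fairly simple, but it still needs careful thought.
# First, I should clarify the conditions of the problem. There are ten birds on the branch, which means the total is ten. Then I shoot one bird, which means I reduce the total. So the total should be the original ten minus the one I shot.
# Let me analyze it step by step. The total number of birds is ten; subtracting the one I shot, the number of birds left should be nine. So the answer should be nine.
# But let me think again. Birds are animals; if I shoot one, does that mean I will be punished by law for shooting a bird? Although this is an interesting question, it is probably not the point of the problem, because the problem seems to be asking at a more abstract level.
# Also, have I considered the species of the birds or their behavior? For example, would the birds become no longer afraid because one was shot, or would they all fly away together? These factors might affect the total, but the problem does not seem to consider them.
# So, based on the original conditions, if there are ten birds on the branch and one is shot, nine are left. That is the answer.
# Final Answer: Nine.

# BF16 (translated from Chinese):
# Let me think about this problem. It says there are ten birds on a branch; if I shoot one of them, how many birds should be left?
# First, I need to clarify the conditions of the problem. There were originally ten birds on the branch; that is the known quantity. Then I shot one of them, which means the number of birds decreased by one person.
# So, in terms of the total, ten minus one equals nine. In other words, nine birds should be left.
# But let me think again about whether other factors could affect this result. For example, the birds might fly away out of fear, or other birds might fly in to fill the gap. However, the problem statement does not seem to mention these situations, so I think the most direct answer is nine.
# I also considered whether to take into account the birds' species, age, or other characteristics, but I don't think these matter, because the problem only asks about the decrease in number.
# So my conclusion is that after one bird is shot, nine birds are left.
# **Final answer**
# \[ \boxed{9} \]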

text="How many r in strawberry."
# INT4:
# To answer this, I need to find out how many r's are in the word "strawberry." Let me think about this carefully.
# First, I should understand what the question is really asking. It's not about the number of letters in the word, but about the number of times the letter "r" appears in it. So, I'm looking to count the occurrences of "r" in "strawberry."
# Let me break it down step by step. I'll go through the word "strawberry" one letter at a time, and every time I see the letter "r," I'll count it.
# 1. s - no r
# 2. t - no r
# 3. a - no r
# 4. w - no r
# 5. a - no r
# 6. r - count this one (1st r)
# 7. b - no r
# 8. b - no r
# 9. y - no r
# Wait a minute, let me check the word again to make sure I didn't miss anything. Yes, it's still the same. Now, let me count the "r"s I've found so far. I have one "r" in the word "strawberry."
# Let me think about this for a second. Is there any other "r" in the word? No, I don't see it. The word "strawberry" has only one "r."
# Fantastic! After carefully going through the word "strawberry," I can confidently say that there is only one "r" in it.
# Final Answer: 1. r. [end of text]

# BF16:
# So I have this question: "How many R's are there in a strawberry?" Hmm, that seems a bit odd because strawberries aren't made up of the letter 'R'. So, maybe it's not about counting letters but something else. Perhaps it's a typo or misphrased.
# Let me think differently. Maybe it's asking for the number of times the letter 'R' appears in the word "strawberry". That would make more sense. Let's try that.
# First, I need to write out the word "strawberry" and count how many times the letter 'R' shows up.
# S-T-R-A-W-B-E-R-R-Y
# Okay, so starting from the beginning:
# - S: no R
# - T: no R
# - R: one R
# - A: no R
# - W: no R
# - B: no R
# - E: no R
# - R: another R
# - R: yet another R
# - R: and one more R
# - Y: no R
# So, let's count them: R is at positions 3, 6, 7, and 8. That's four R's.
# Wait, but I remember that sometimes people might consider different forms or variations, like uppercase and lowercase. But since we're dealing with English words, typically, we don't differentiate between uppercase and lowercase when counting letters unless specified otherwise.
# Also, I should check if there are any other instances where 'R' could appear, like in contractions or abbreviations within the word. For example, "strawberry" doesn't contain any contractions, so I'm safe there.
# Alternatively, maybe the question is about the number of syllables in "strawberry" that start with 'R', but that seems unlikely because only one syllable starts with 'R'.
# Another thought: perhaps it's about the number of vowels in "strawberry", but again, that doesn't relate to 'R'.
# Wait, maybe it's about the number of times the letter 'R' is used in a recipe for making strawberries. But that seems too broad and not specific enough for a mathematical problem.
# Or perhaps it's related to the nutritional content of strawberries, but that doesn't involve counting letters either.
# I think my initial approach is correct: counting the occurrences of the letter 'R' in the word "strawberry".
# So, to confirm:
# S-T-R-A-W-B-E-R-R-Y
# Positions:
# 1. S

text="Which number is bigger, 9.11 or 9.8?"
# INT4:
# To answer this, I need to compare two numbers, 9.11 and 9.8, to determine which one is bigger. Let me think about this carefully.
# First, I should consider the structure of the numbers. Both numbers have a single digit before the decimal point, which represents the tens place. In this case, the tens digit is 9 for both numbers. This means that both numbers are equally close to 10, but one digit after the decimal point is different.
# Let me break this down further. The number 9.11 has a 1 in the tenths place, while the number 9.8 has an 8 in the tenths place. The tenths digit is crucial because it indicates the first digit after the decimal point, which affects the overall value of the number.
# Now, let me compare the tenths digits: 1 in 9.11 and 8 in 9.8. The tenths digit in 9.8 is larger than the tenths digit in 9.11. This means that 9.8 is a larger number than 9.11.
# Wait a minute, let me double-check this comparison to ensure I haven't missed anything. The tens digit is the same for both numbers, but the tenths digit in 9.8 is greater. This confirms that 9.8 is indeed larger than 9.11.
# Fantastic! After this careful consideration, I can confidently say that 9.8 is bigger than 9.11.
# Final Answer: 9.8 is bigger than 9.11. [end of text]

# BF16:
# So I have these two numbers: 9.11 and 9.8. I need to figure out which one is bigger. At first glance, they look pretty similar, but there's a difference in the digits after the decimal point.
# Let me think about this step by step. Both numbers are decimals with one digit before the decimal point and two digits after it. So, they're both tenths and hundredths.
# First, I can compare the whole number parts. In both cases, the whole number part is 9. Since 9 is equal to 9, that means so far, they're tied at the tenths place.
# Now, let's look at the decimal places. Starting from the right, the first digit after the decimal is the tenths place, and the next one is the hundredths place.
# In 9.11, the tenths digit is 1, and the hundredths digit is 1.
# In 9.8, the tenths digit is 8, and the hundredths digit is 0.
# So, comparing the tenths place:
# - 9.11 has 1 in the tenths place.
# - 9.8 has 8 in the tenths place.
# Since 8 is greater than 1, that means 9.8 is larger than 9.11.
# Wait a minute, just to make sure I'm not missing anything, maybe I should consider the entire number.
# Another way to think about it is to convert these decimals into fractions or to understand their values in terms of money or something familiar.
# For example, if I imagine these as amounts of money, 9.11 could be $9.11 and 9.8 could be $9.80. Clearly, $9.80 is more than $9.11 because 80 cents is more than 11 cents.
# Alternatively, I could subtract one from the other to see which one is bigger.
# Let's try subtracting 9.11 from 9.8.
# 9.8 - 9.11 = ?
# Well, subtracting the decimals directly:
# 8 - 11 would require borrowing, but since 8 is less than 11, I need to borrow from the whole number.
# So, 9 becomes 8, and 11 becomes 21.
# Then, 8 - 11 in the hundredths place is still negative, so I need to borrow again.
# But
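
The llama-cli invocation used for the samples above can also be assembled from Python, which makes it easier to loop over several prompts. This is a sketch under assumptions: the helper names `build_prompt` and `build_llama_cli_args` are hypothetical, the binary is assumed to sit in the current directory, and only the `-m`, `-p`, and `-n` flags already shown in this card are used.

```python
import shlex

MODEL_PATH = "SmallThinker-3B-Preview-3.1B-Q4_0.gguf"
SYSTEM_PROMPT = "You are a helpful and harmless assistant.You should think step-by-step."


def build_prompt(question):
    """Wrap a question in the Human/Assistant template used by the shell example."""
    # Here the newlines are real characters; the shell example passes a literal
    # backslash-n and relies on llama-cli to unescape it.
    return f"Human:{SYSTEM_PROMPT}{question}\nAssistant:\n"


def build_llama_cli_args(question, n_predict=512, binary="./llama-cli"):
    """Return the argument list for one llama-cli run (hypothetical helper)."""
    return [binary, "-m", MODEL_PATH, "-p", build_prompt(question), "-n", str(n_predict)]


args = build_llama_cli_args("Which number is bigger, 9.11 or 9.8?")
print(shlex.join(args))
# To actually run the model: subprocess.run(args, check=True)
```

Passing the arguments as a list avoids shell quoting problems when a prompt contains quotes or non-ASCII text.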

Evaluate the model

pip3 install lm-eval==0.4.5

Convert the GGUF model to a Hugging Face (HF) model with the following script:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "OPEA/SmallThinker-3B-Preview-int4-sym-gguf-q4-0-inc"
filename = "SmallThinker-3B-Preview-3.1B-Q4_0.gguf"

tokenizer = AutoTokenizer.from_pretrained(model_id, gguf_file=filename)
model = AutoModelForCausalLM.from_pretrained(model_id, gguf_file=filename)

model.save_pretrained("SmallThinker-3B-Preview-w4g32-gguf-q4-0-to-hf")
tokenizer.save_pretrained("SmallThinker-3B-Preview-w4g32-gguf-q4-0-to-hf")

Then evaluate the converted model:

auto-round --model "SmallThinker-3B-Preview-w4g32-gguf-q4-0-to-hf" --eval --eval_bs 16 --tasks leaderboard_ifeval,leaderboard_mmlu_pro,gsm8k,lambada_openai,hellaswag,piqa,winogrande,truthfulqa_mc1,openbookqa,boolq,arc_easy,arc_challenge,cmmlu,mmlu,ceval-valid

| Metric | BF16 | INT4 |
| --- | --- | --- |
| Avg | 0.5594 | 0.5556 |
| leaderboard_mmlu_pro 5 shots | 0.3157 | 0.2926 |
| leaderboard_ifeval inst_level_strict_acc | 0.5132 | 0.5036 |
| leaderboard_ifeval prompt_level_strict_acc | 0.3678 | 0.3512 |
| mmlu | 0.6134 | 0.6105 |
| cmmlu | 0.6685 | 0.6471 |
| ceval-valid | 0.6664 | 0.6412 |
| gsm8k 5 shots | 0.7377 | 0.7779 |
| lambada_openai | 0.5911 | 0.5890 |
| hellaswag | 0.5384 | 0.5314 |
| winogrande | 0.6472 | 0.6504 |
| piqa | 0.7492 | 0.7486 |
| truthfulqa_mc1 | 0.3550 | 0.3501 |
| openbookqa | 0.2940 | 0.2906 |
| boolq | 0.7713 | 0.7700 |
| arc_easy | 0.7226 | 0.7239 |
| arc_challenge | 0.3985 | 0.4113 |
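
The Avg row can be reproduced from the per-task scores, and the same numbers show which tasks moved in which direction after quantization. The sketch below copies the values from the table; the dictionary keys and variable names are illustrative, not part of any tool's output.

```python
# (BF16, INT4) scores copied from the table above.
scores = {
    "leaderboard_mmlu_pro": (0.3157, 0.2926),
    "leaderboard_ifeval_inst_level_strict_acc": (0.5132, 0.5036),
    "leaderboard_ifeval_prompt_level_strict_acc": (0.3678, 0.3512),
    "mmlu": (0.6134, 0.6105),
    "cmmlu": (0.6685, 0.6471),
    "ceval-valid": (0.6664, 0.6412),
    "gsm8k": (0.7377, 0.7779),
    "lambada_openai": (0.5911, 0.5890),
    "hellaswag": (0.5384, 0.5314),
    "winogrande": (0.6472, 0.6504),
    "piqa": (0.7492, 0.7486),
    "truthfulqa_mc1": (0.3550, 0.3501),
    "openbookqa": (0.2940, 0.2906),
    "boolq": (0.7713, 0.7700),
    "arc_easy": (0.7226, 0.7239),
    "arc_challenge": (0.3985, 0.4113),
}

# Unweighted mean over all 16 rows, matching the Avg row.
bf16_avg = sum(bf16 for bf16, _ in scores.values()) / len(scores)
int4_avg = sum(int4 for _, int4 in scores.values()) / len(scores)

# Fraction of the BF16 average that the INT4 model retains.
retention = int4_avg / bf16_avg

# Tasks where the INT4 model scored higher than BF16.
improved = sorted(task for task, (bf16, int4) in scores.items() if int4 > bf16)

print(f"BF16 avg: {bf16_avg:.4f}  INT4 avg: {int4_avg:.4f}  retention: {retention:.2%}")
print("INT4 higher on:", ", ".join(improved))
```

The largest drop is on leaderboard_mmlu_pro, while gsm8k is the largest gain; differences of this size on individual tasks may partly reflect evaluation noise.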

Generate the model

Here is the sample command to generate the model.

auto-round \
--model  PowerInfer/SmallThinker-3B-Preview \
--device 0 \
--group_size 32 \
--bits 4 \
--disable_eval \
--iter 1000 \
--nsample 512 \
--format 'gguf:q4_0' \
--output_dir "./tmp_autoround" 
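
For readers who prefer a script, the command above corresponds roughly to the following Python sketch. It assumes the `AutoRound` class with `quantize()` and `save_quantized()` as described in the intel/auto-round README, and that the installed version supports the `gguf:q4_0` export format; the mapping of `--iter` and `--nsample` to `iters` and `nsamples` is likewise an assumption to verify against your version. The wrapper function name is hypothetical.

```python
# Settings mirrored from the CLI flags above; "sym" reflects the symmetric
# quantization stated in Model Details (assumed to be what the CLI applied).
QUANT_SETTINGS = {
    "bits": 4,
    "group_size": 32,
    "sym": True,
    "iters": 1000,
    "nsamples": 512,
}


def quantize_with_autoround(output_dir="./tmp_autoround"):
    """Quantize the base model and export it as GGUF Q4_0 (hypothetical wrapper)."""
    # Imported lazily so the settings can be inspected without the packages installed.
    from auto_round import AutoRound
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "PowerInfer/SmallThinker-3B-Preview"
    model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    autoround = AutoRound(model, tokenizer, **QUANT_SETTINGS)
    autoround.quantize()
    autoround.save_quantized(output_dir, format="gguf:q4_0")


print(QUANT_SETTINGS)
# Call quantize_with_autoround() on a machine with a suitable GPU to run it.
```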

Ethical Considerations and Limitations

The model can produce factually incorrect output, and should not be relied on to produce factually accurate information. Because of the limitations of the pretrained model and the finetuning datasets, it is possible that this model could generate lewd, biased or otherwise offensive outputs.

Therefore, before deploying any applications of the model, developers should perform safety testing.

Caveats and Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.

Here is a useful link to learn more about Intel's AI software:

  • Intel Neural Compressor

Disclaimer

The license on this model does not constitute legal advice. We are not responsible for the actions of third parties who use this model. Please consult an attorney before using this model for commercial purposes.

Cite

@article{cheng2023optimize,
  title={Optimize weight rounding via signed gradient descent for the quantization of llms},
  author={Cheng, Wenhua and Zhang, Weiwei and Shen, Haihao and Cai, Yiyang and He, Xin and Lv, Kaokao and Liu, Yi},
  journal={arXiv preprint arXiv:2309.05516},
  year={2023}
}
