IQ4_XS size seems low?
Why does the IQ4_XS quant of this 0.6B model clock in at ~367MB? It seems to be ~450MB for other people, see for example:
https://huggingface.co/bartowski/Qwen_Qwen3-0.6B-GGUF/blob/main/Qwen_Qwen3-0.6B-IQ4_XS.gguf
https://huggingface.co/Mungert/Qwen3-0.6B-GGUF/blob/main/Qwen3-0.6B-iq4_xs.gguf
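For a rough sense of scale, the two sizes can be turned into effective bits per weight. This is only a back-of-the-envelope sketch: the parameter count of 0.6e9 is an assumption taken from the model name (the true count may differ), the sizes are read as decimal megabytes, and the file also holds metadata and tensors stored at other precisions.

```python
def effective_bits_per_weight(file_size_bytes, n_params):
    """Average number of bits the file spends per model parameter."""
    return file_size_bytes * 8 / n_params


# Assumption: nominal parameter count taken from the "0.6B" in the model name.
N_PARAMS = 0.6e9

# Sizes quoted in this thread, read as decimal megabytes (assumption).
for label, size_mb in (("this repo", 367), ("other uploads", 450)):
    bpw = effective_bits_per_weight(size_mb * 1e6, N_PARAMS)
    print(f"{label}: ~{bpw:.2f} bits per weight")
```

A gap this large between files of the same quant type suggests the files differ in more than rounding, for example in which tensors are kept at higher precision or in the underlying model.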
I do not know. We are simply using whatever defaults llama.cpp deems right. At least bartowski often produces non-standard quants, but he usually clearly labels them, so that's probably not it. Could also be a version difference.
I will give them both a try and see if there is any quality difference.
I just tested both our and bartowski's IQ4_XS quants. While both loaded and worked on the latest llama.cpp, our quant seems to generate only garbage. Maybe we should try to requant and see if that fixes the issues with our IQ4_XS quant.
we should find out what the difference is, e.g. if the original model has changed and we queued it first (or vice versa), or if there was a fix or regression in llama.cpp (that will affect other models as well). just blindly requeuing will likely not tell us what went wrong.
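One way to look for the difference without requantizing is to compare the metadata stored in the two GGUF files. The following is a minimal sketch, not an official tool: it assumes GGUF version 2 or later (64-bit counts, little-endian), the function name `read_gguf_metadata` is made up for illustration, and whether `general.name` actually distinguishes two uploads depends on what the converter wrote there.

```python
import struct

# GGUF metadata value type ids mapped to struct formats for fixed-size scalars.
_SCALARS = {0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i",
            6: "<f", 7: "<?", 10: "<Q", 11: "<q", 12: "<d"}
_STRING, _ARRAY = 8, 9


def _read(f, fmt):
    """Read one fixed-size little-endian value from the file."""
    size = struct.calcsize(fmt)
    data = f.read(size)
    if len(data) != size:
        raise ValueError("unexpected end of file")
    return struct.unpack(fmt, data)[0]


def _read_string(f):
    """GGUF strings are a uint64 byte length followed by UTF-8 bytes."""
    length = _read(f, "<Q")
    return f.read(length).decode("utf-8", errors="replace")


def _read_value(f, vtype):
    if vtype in _SCALARS:
        return _read(f, _SCALARS[vtype])
    if vtype == _STRING:
        return _read_string(f)
    if vtype == _ARRAY:
        # Arrays store the element type, then the element count, then the items.
        etype = _read(f, "<I")
        count = _read(f, "<Q")
        return [_read_value(f, etype) for _ in range(count)]
    raise ValueError(f"unknown GGUF value type {vtype}")


def read_gguf_metadata(path, wanted=("general.name", "general.architecture",
                                     "general.file_type")):
    """Return the header counts plus the requested metadata keys (hypothetical helper)."""
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            raise ValueError("not a GGUF file")
        found = {"version": _read(f, "<I"), "tensor_count": _read(f, "<Q")}
        kv_count = _read(f, "<Q")
        for _ in range(kv_count):
            key = _read_string(f)
            value = _read_value(f, _read(f, "<I"))
            if key in wanted:
                found[key] = value
        return found
```

Running `read_gguf_metadata` on both downloaded files and comparing the resulting dictionaries side by side should show quickly whether they were converted from the same source model.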
Seems the upstream repo has not changed for a while, so it's not a case of the model changing.
I also see no Qwen3-related change entries in llama.cpp in our version since Apr 9.
It also seems to work fine for me, for a 0.6B (tested with the i1-IQ4_XS):
> Hi, tell me a story.
Sure! Here's a short story:
In a small village, a young girl named Lily lived with her parents. One day, a mysterious stranger appeared at their door. They were worried that something terrible could happen. But when the stranger said it was okay, the young girl felt a bit of hope. The stranger gave Lily a little treasure box and said, "I'm not here to hurt you, but I'm here to help you." And with a smile, the stranger left, and the young girl was safe.
This is a very simple and easy story!
> Explain Bernoulli numbers to me.
Bernoulli numbers are a type of special numbers in mathematics. They are a sequence of numbers that can be calculated using a formula, and they have many interesting properties and applications.
1. **Definition**: A Bernoulli number is a number that can be calculated using a formula involving powers of x and an infinite series.
2. **Properties**:
- They are related to the exponential function and other special functions.
- They have a connection to the zeta function and the Riemann zeta function.
- They are used in various mathematical fields, including number theory, analysis, and combinatorics.
3. **Formula**:
For odd positive integers, the n-th Bernoulli number \( B_n \) is given by:
\[ B_n = (-1)^{n+1} \frac{1}{2^{n+1}} \binom{2n}{n} \zeta(2n) \]
Here, \( \zeta(s) \) is the Riemann zeta function, and \( \binom{2n}{n} \) is the binomial coefficient.
4. **Applications**:
- They appear in the study of number theory, particularly in the theory of partitions.
- They are used in the theory of integrals, especially in the evaluation of integrals involving powers of x.
- They have connections to the theory of modular forms and are used in the study of q-series and other q-calculus.
5. **Real-Life Example**:
In some financial models, Bernoulli numbers are used to model the behavior of certain assets, such as stocks or bonds. This is because their properties can help in understanding the statistical behavior of these assets over time.
This is a very basic introduction to Bernoulli numbers! If you have any more questions, feel free to ask.
And conversely, with bartowski's IQ4_XS, I get never-ending output (this is after 4000 tokens, when I stopped it):
Alternatively, maybe the problem is that the Bernoulli numbers do not exist for all integers, but that's not true, since they do exist for all integers.
Alternatively, maybe there's a confusion with the sign. For example, the Bernoulli numbers are sometimes given with a different sign convention, but I think that's part of the standard definition.
Wait, perhaps the issue is that the Bernoulli numbers are defined in terms of the Dirichlet eta function, but if that function is being used, there's a different definition. But the standard definition uses the power series expansion.
Alternatively, maybe the issue is with the recurrence relation. The recurrence relation for Bernoulli numbers is B_n = -1/n * sum_{k=1}^{n-1} B_k /k. This is derived from the derivative of the generating function, which leads to the recurrence. Therefore, the recurrence is correct, and the definition is correct. There's no issue with the recurrence relation; it's a standard result.
Therefore, the definition of Bernoulli numbers is correct, and the issue might be elsewhere, such as in the context of the series or other series expansions where they are used. For example, if the user is referring to a specific problem or textbook where there's a problem with the definition, but without more context, it's hard to say.
I notice that I never get <think> tags with ours, but always with bartowski's.
While llama-cli only gives repeating sentences, likely because it uses the wrong chat template for a base model, with llama-server our model is working perfectly fine:
Prompt: What is the meaning of life?
Assistant, the meaning of life is a complex and deeply personal question. Philosophical perspectives on this topic are diverse and vary widely across different cultures and belief systems. Here are a few prominent ideas:
1. **Existence-Based**: Many people believe that life is an essential part of existence, and the meaning of life is to live a meaningful existence that fulfills one's purpose and contributes to the greater good. This view often aligns with existentialist philosophy, which suggests that life is a fundamental aspect of reality.
2. **Purpose-Based**: Some people believe that life has a specific purpose or goal. Whether that goal is personal, societal, or spiritual, people find meaning in their lives by striving to achieve or fulfill a purpose.
3. **Creativity-Based**: Many people feel a sense of purpose because they enjoy creativity and the ability to create something meaningful. This could be in the form of art, literature, music, or any activity that brings joy and fulfillment.
4. **Spiritual**: For those who identify with a spiritual perspective, the meaning of life may be tied to their belief in a higher power, a spiritual journey, or the pursuit of enlightenment and wisdom.
5. **Societal-Based**: People may also find meaning in their role within society, whether it is a role that is fulfilling or a role that contributes to the greater good of the community.
In conclusion, the meaning of life is a deeply personal question, and there is no single answer that applies to everyone. It is a question that invites reflection and introspection, and it is ultimately up to each individual to find their own answer.
I think this discussion can be closed, as it comes down to @Mushoz comparing a base model with instruction-tuned models. I even checked some other Qwen3-0.6B-Base GGUFs on HuggingFace, and they are all smaller than the instruction-tuned GGUFs and in our size range.
Wow, I thought I explicitly checked that they were the same models. Good that you checked again.
I forgot to mention that I used llama-cli --jinja (a reflex for qwen3), so I might have had more luck.