Uploaded model

  • Developed by: student-abdullah
  • License: apache-2.0
  • Quantized from model: Qwen2.5-Coder-0.5B
  • Created on: 6th July, 2025

Acknowledgement


Quantization Description

This model is quantized from the Qwen2.5-Coder-0.5B base model using selective quantization to increase inference speed while preserving its ability to generate relevant and accurate responses related to Python programming. The quantization method kept the following layers at 32-bit precision:

  • q_proj
  • k_proj
  • v_proj
  • o_proj
  • down_proj
  • gate_proj
  • up_proj
  • lm_head

The remaining layers were quantized to Q3_K_L.
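
The per-tensor selection rule can be illustrated with a minimal Python sketch (a hypothetical helper, not the exact script used to produce this GGUF): tensors belonging to the layers listed above are kept at 32-bit precision, everything else is marked for Q3_K_L.

```python
# Hypothetical illustration of the selective quantization rule described above.
KEEP_32BIT = {
    "q_proj", "k_proj", "v_proj", "o_proj",
    "down_proj", "gate_proj", "up_proj", "lm_head",
}

def target_quant_type(tensor_name: str) -> str:
    """Return the quantization target for a tensor, based on its layer name."""
    parts = tensor_name.split(".")
    # e.g. "model.layers.0.self_attn.q_proj.weight" -> "q_proj"
    layer = parts[-2] if parts[-1] in ("weight", "bias") else parts[-1]
    return "F32" if layer in KEEP_32BIT else "Q3_K_L"

print(target_quant_type("model.layers.0.self_attn.q_proj.weight"))  # F32
print(target_quant_type("model.embed_tokens.weight"))               # Q3_K_L
```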


Model Description

| Layer Name | Role (Short) | Type |
|---|---|---|
| q_proj, k_proj, v_proj | Compute query, key, and value for the attention mechanism | Attention Proj |
| o_proj | Projects attention output back to model hidden size | Attention Proj |
| down_proj | Projects MLP output down to hidden size | MLP |
| gate_proj | First part of gated MLP, controls information flow | MLP |
| up_proj | Expands hidden size in MLP | MLP |
| lm_head | Final linear layer for logits | Output Head |
| embed_tokens | Token embedding layer | Input Embed |
| norm | Final layernorm | Normalization |
| *_layernorm | Normalize inputs to each layer | Normalization |
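
To see how these layer types contribute to the overall parameter count, a small Python sketch (an illustration only, assuming the transformers library and access to the base checkpoint) can group parameters by module name, mirroring the table above:

```python
from collections import Counter

from transformers import AutoModelForCausalLM

# Load the base checkpoint (the upstream Qwen2.5-Coder-0.5B model).
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-0.5B")

counts = Counter()
for name, param in model.named_parameters():
    # Group by the parent module name, e.g. "q_proj", "embed_tokens", "input_layernorm".
    group = name.split(".")[-2]
    counts[group] += param.numel()

for group, n in counts.most_common():
    print(f"{group:>20}: {n / 1e6:6.1f}M parameters")
```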

Model Architecture

Qwen2ForCausalLM(
  (model): Qwen2Model(
    (embed_tokens): Embedding(151936, 896, padding_idx=151665)
    (layers): ModuleList(
      (0-23): 24 x Qwen2DecoderLayer(
        (self_attn): Qwen2Attention(
          (q_proj): Linear(in_features=896, out_features=896, bias=True)
          (k_proj): Linear(in_features=896, out_features=128, bias=True)
          (v_proj): Linear(in_features=896, out_features=128, bias=True)
          (o_proj): Linear(in_features=896, out_features=896, bias=False)
          (rotary_emb): LlamaRotaryEmbedding()
        )
        (mlp): Qwen2MLP(
          (gate_proj): Linear(in_features=896, out_features=4864, bias=False)
          (up_proj): Linear(in_features=896, out_features=4864, bias=False)
          (down_proj): Linear(in_features=4864, out_features=896, bias=False)
          (act_fn): SiLU()
        )
        (input_layernorm): Qwen2RMSNorm((896,), eps=1e-06)
        (post_attention_layernorm): Qwen2RMSNorm((896,), eps=1e-06)
      )
    )
    (norm): Qwen2RMSNorm((896,), eps=1e-06)
    (rotary_emb): LlamaRotaryEmbedding()
  )
  (lm_head): Linear(in_features=896, out_features=151936, bias=False)
)
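
The module tree above can be reproduced with the transformers library (a minimal sketch; exact class names in the printout may vary slightly between transformers versions):

```python
from transformers import AutoModelForCausalLM

# Loading the base model and printing it yields the module tree shown above.
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-0.5B")
print(model)
```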

Performance & Limitations

  • YET TO BE EXAMINED

Model Performance Evaluation:

  • YET TO BE EVALUATED

GGUF Details

  • Model size: 494M params
  • Architecture: qwen2
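
A minimal loading sketch for the GGUF file, assuming llama-cpp-python is installed; the filename glob is an assumption and should be adjusted to the actual file in this repository:

```python
from llama_cpp import Llama

# Download the GGUF file from the Hub and load it for local inference.
# The "*.gguf" filename pattern is an assumption; replace it with the exact file name if needed.
llm = Llama.from_pretrained(
    repo_id="student-abdullah/Quantized_Qwen-2.5-Coding-0.5B_icl_mlp",
    filename="*.gguf",
    n_ctx=4096,
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a Python function that reverses a string."}]
)
print(response["choices"][0]["message"]["content"])
```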