---
tags:
- causal-lm
- qwen
- fine-tuned
- quantized
- mnlp
---
# Qwen3-0.6B Full-Precision + W8A8 Quantized MCQA Model
**Repository:** [Kikinoking/MNLP_M2_quantized_model](https://huggingface.co/Kikinoking/MNLP_M2_quantized_model)
This is a fine-tuned Qwen3-0.6B causal LM, trained on a concatenation of several MCQA datasets and then quantized to 8-bit weights and activations (W8A8) in the compressed-tensors format. It is intended for multiple-choice QA tasks and is evaluated with the LightEval EPFL MNLP suite.
---
## Model Details
- **Base architecture:** Qwen3 (0.6B parameters)
- **Pretrained checkpoint:** `Qwen/Qwen3-0.6B-Base`
- **Fine-tuning data sources:**
- ScienceQA
- QASC
- OpenBookQA
- MathQA
- CommonsenseQA
- MCQA prompts generated via ChatGPT (labeled `M1_chatgpt`)
- **Dataset split:** 95% train / 5% validation (data preparation is sketched after this list)
- **Tokenization:**
- `AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B-Base")`
- Left padding, EOS token as pad_token
- Sequence length capped at 2048 tokens
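
A minimal sketch of the data preparation described above. The dataset objects here are stand-ins (each real MCQA set needs its own loading and normalization into a shared text format); the tokenizer settings match the list:

```python
from datasets import Dataset, concatenate_datasets
from transformers import AutoTokenizer

# Stand-ins for the real MCQA sets (ScienceQA, QASC, OpenBookQA, ...);
# each real set needs its own loading and normalization step first.
ds_a = Dataset.from_dict({"text": ["Question: ...\nAnswer: A"]})
ds_b = Dataset.from_dict({"text": ["Question: ...\nAnswer: C"]})

merged = concatenate_datasets([ds_a, ds_b])
splits = merged.train_test_split(test_size=0.05, seed=42)  # 95% train / 5% val

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B-Base")
tokenizer.padding_side = "left"            # left padding
tokenizer.pad_token = tokenizer.eos_token  # EOS token as pad_token

def tokenize(batch):
    # Cap sequences at 2048 tokens
    return tokenizer(batch["text"], truncation=True, max_length=2048)

tokenized = splits.map(tokenize, batched=True)
```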
---
## Quantization
- **Method:** `compressed-tensors` / `naive-quantized` (a hypothetical recipe is sketched after this list)
- **Precision:** 8-bit weights + 8-bit activations
- **Layers kept in FP32:** Language modeling head
- **Checkpoint:** Compatible with CPU and GPU inference
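
For reference, a hypothetical W8A8 recipe using the `llmcompressor` library, which writes checkpoints in the compressed-tensors format. The exact recipe used for this checkpoint is not published, so the modifier choice, calibration settings, and paths below are assumptions:

```python
from llmcompressor.modifiers.quantization import QuantizationModifier
from llmcompressor.transformers import oneshot

# Assumed recipe: 8-bit weights + 8-bit activations on all linear layers,
# keeping the LM head in full precision (as noted above).
recipe = QuantizationModifier(
    targets="Linear",
    scheme="W8A8",
    ignore=["lm_head"],
)

oneshot(
    model="path/to/finetuned-qwen3-0.6b",  # hypothetical fine-tuned checkpoint
    dataset="open_platypus",               # illustrative calibration set
    recipe=recipe,
    max_seq_length=2048,
    num_calibration_samples=512,
    output_dir="qwen3-0.6b-w8a8",
)
```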
---
## Evaluation
Tested using LightEval EPFL MNLP on the MCQA task:
```bash
lighteval accelerate --eval-mode lighteval --save-details \
  --override-batch-size 8 \
  --custom-tasks community_tasks/mnlp_mcqa_evals.py \
  --output-dir out/lighteval_quant \
  model_configs/quantized_model.yaml \
  "community|mnlp_mcqa_evals|0|0"
```
**Results:**

- **Accuracy:** 0.30 ± 0.15
- **Normalized accuracy:** 0.30 ± 0.15
---
## How to Use
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained(
    "Kikinoking/MNLP_M2_quantized_model", trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    "Kikinoking/MNLP_M2_quantized_model",
    trust_remote_code=True,
    device_map="auto",  # places the model on GPU if available, else CPU
)
```
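
A minimal generation example (the prompt below is illustrative; any MCQA-style prompt works):

```python
prompt = (
    "Question: Which planet is known as the Red Planet?\n"
    "A. Venus\nB. Mars\nC. Jupiter\nD. Saturn\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=5)
# Decode only the newly generated tokens
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:],
                       skip_special_tokens=True))
```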
---
## Limitations
- Being a 0.6B-parameter model, it may struggle with very long or ambiguous queries.
- Quantization can introduce a slight drop in accuracy (~5–10%).
---
## License
CC BY-NC 4.0 (inherits from the base Qwen3 license)