bnjmnmarie committed · verified
Commit 10f53c6 · 1 Parent(s): 513021b

Update README.md

Files changed (1)
  1. README.md +29 -3
README.md CHANGED
@@ -1,3 +1,29 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ base_model:
+ - Qwen/Qwen3-14B
+ tags:
+ - llm-compressor
+ datasets:
+ - HuggingFaceH4/ultrachat_200k
+ ---
+ This is [Qwen/Qwen3-14B](https://huggingface.co/Qwen/Qwen3-14B) quantized to 4-bit (NVFP4), weights and activations, with [LLM Compressor](https://github.com/vllm-project/llm-compressor).
+ The calibration step used 512 samples of up to 2,048 tokens each, with the chat template applied, from [HuggingFaceH4/ultrachat_200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k); a sketch of the recipe is shown after the diff.
+
+ The model was quantized, tested, and evaluated by The Kaitchup.
+ It is compatible with vLLM (see the usage sketch after the diff). Use a Blackwell GPU to get more than 2x higher throughput.
+
+ More details in this article:
+ [NVFP4: Same Accuracy with 2.3x Higher Throughput for 4-Bit LLMs](https://kaitchup.substack.com/p/nvfp4-same-accuracy-with-23-higher)
+
+ - **Developed by:** [The Kaitchup](https://kaitchup.substack.com/)
+ - **License:** Apache 2.0
+
+ ## How to Support My Work
+ Subscribe to [The Kaitchup](https://kaitchup.substack.com/subscribe).
+ Or, for a one-time contribution, here is my Ko-fi link: [https://ko-fi.com/bnjmn_marie](https://ko-fi.com/bnjmn_marie)
+
+ This helps me continue quantizing and evaluating models for free.
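For reference, here is a minimal sketch of how an NVFP4 quantization with the calibration setup described in the card (512 ultrachat_200k samples, chat template applied, 2,048-token max length) might be reproduced. It follows llm-compressor's published `oneshot` examples; the recipe, argument names, and the `ignore=["lm_head"]` choice are assumptions to verify against the library's docs, not the card author's exact script.

```python
# Hypothetical reproduction sketch; verify against current llm-compressor docs.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

MODEL_ID = "Qwen/Qwen3-14B"
NUM_CALIBRATION_SAMPLES = 512   # as stated in the card
MAX_SEQUENCE_LENGTH = 2048      # as stated in the card

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# Calibration set: ultrachat_200k with the chat template applied, then tokenized.
ds = load_dataset("HuggingFaceH4/ultrachat_200k", split=f"train_sft[:{NUM_CALIBRATION_SAMPLES}]")
ds = ds.map(lambda ex: {"text": tokenizer.apply_chat_template(ex["messages"], tokenize=False)})
ds = ds.map(
    lambda ex: tokenizer(ex["text"], max_length=MAX_SEQUENCE_LENGTH, truncation=True, add_special_tokens=False),
    remove_columns=ds.column_names,
)

# NVFP4 quantizes both weights and activations to 4-bit; the lm_head is left in higher precision.
recipe = QuantizationModifier(targets="Linear", scheme="NVFP4", ignore=["lm_head"])

oneshot(
    model=model,
    dataset=ds,
    recipe=recipe,
    max_seq_length=MAX_SEQUENCE_LENGTH,
    num_calibration_samples=NUM_CALIBRATION_SAMPLES,
)

# Save in compressed-tensors format so vLLM can load it directly.
model.save_pretrained("Qwen3-14B-NVFP4", save_compressed=True)
tokenizer.save_pretrained("Qwen3-14B-NVFP4")
```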
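And a minimal inference sketch with vLLM. The path below is a placeholder, not necessarily this model's actual Hub id; the advertised throughput gains assume a Blackwell GPU and a recent vLLM build with NVFP4 kernels.

```python
from vllm import LLM, SamplingParams

# Placeholder id: replace with this model's Hugging Face repo id or a local path.
llm = LLM(model="Qwen3-14B-NVFP4")

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Explain NVFP4 quantization in one paragraph."], params)
print(outputs[0].outputs[0].text)
```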