boboliu committed on
Commit 24f8637 · verified · 1 Parent(s): 7a657e4

Delete .ipynb_checkpoints

.ipynb_checkpoints/README-checkpoint.md DELETED
@@ -1,25 +0,0 @@
- ---
- license: apache-2.0
- base_model:
- - Qwen/Qwen3-Reranker-0.6B
- pipeline_tag: text-classification
- tags:
- - transformers
- ---
- # Qwen3-Reranker-0.6B-W4A16-G128
-
- GPTQ Quantized [Qwen/Qwen3-Reranker-0.6B](https://huggingface.co/Qwen/Qwen3-Reranker-0.6B) with Ultrachat, [THUIR/T2Ranking](https://huggingface.co/datasets/THUIR/T2Ranking) and [m-a-p/COIG-CQIA](huggingface.co/datasets/m-a-p/COIG-CQIA) for calibration set.
-
- ## What's the benefit?
-
- VRAM Usage: `3228M` -> `2124M` (w/o FA2, according to Embedding model's result).
-
- ## What's the cost?
-
- I think `<5%` accuracy, further evaluation on the way...
-
- [The Embedding one](https://huggingface.co/boboliu/Qwen3-Embedding-4B-W4A16-G128#whats-the-cost) shows `~0.7%`.
-
- ## How to use it?
-
- `pip install compressed-tensors optimum` and `auto-gptq` / `gptqmodel`, then goto [the official usage guide](https://huggingface.co/Qwen/Qwen3-Reranker-0.6B#transformers-usage).
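
For reference, the VRAM figures quoted in the deleted README can be turned into a relative saving with a few lines of Python. This is only a sketch of the arithmetic: the two numbers are taken verbatim from the README (which attributes them to the Embedding model's result, without FA2), while the function name `vram_reduction` and the constant names are hypothetical and not part of the repository.

```python
# Figures quoted in the deleted README, in the unit it uses ("M").
# Assumption: both values are in the same unit and measured under the same conditions.
VRAM_BEFORE_M = 3228
VRAM_AFTER_M = 2124


def vram_reduction(before: float, after: float) -> tuple[float, float]:
    """Return (absolute saving, fractional saving) for a before/after pair."""
    if before <= 0:
        raise ValueError("before must be positive")
    saved = before - after
    return saved, saved / before


saved_m, saved_fraction = vram_reduction(VRAM_BEFORE_M, VRAM_AFTER_M)
print(f"saved {saved_m}M ({saved_fraction:.1%} of the original footprint)")
```

With the README's numbers this works out to roughly a one-third reduction in VRAM.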