ArtusDev committed on
Commit
2dacc89
·
verified ·
1 Parent(s): ea24ef8

Upload README.md with huggingface_hub

  1. README.md +250 -0
README.md ADDED
@@ -0,0 +1,250 @@
1
+ ---
2
+ base_model: nvidia/Qwen-3-Nemotron-32B-GenRM-Principle
3
+ base_model_relation: quantized
4
+ quantized_by: ArtusDev
5
+ ---
6
+ <style>
7
+ .container-dark {
8
+ font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, "Helvetica Neue", Arial, sans-serif;
9
+ line-height: 1.6;
10
+ color: #d4d4d4;
11
+ }
12
+ a {
13
+ color: #569cd6;
14
+ text-decoration: none;
15
+ font-weight: 600;
16
+ }
17
+ a:hover {
18
+ text-decoration: underline;
19
+ }
20
+ .card-dark {
21
+ background-color: #252526;
22
+ border-radius: 12px;
23
+ padding: 24px;
24
+ margin-bottom: 20px;
25
+ box-shadow: 0 4px 12px rgba(0,0,0,0.3);
26
+ border: 1px solid #3c3c3c;
27
+ }
28
+ .card-dark h1 {
29
+ font-size: 2.2em;
30
+ color: #ffffff;
31
+ text-align: center;
32
+ margin-bottom: 10px;
33
+ }
34
+ .card-dark.card-dark-title h1 {
35
+ font-size: 1.5em;
36
+ }
37
+ .card-dark .subtitle {
38
+ text-align: center;
39
+ font-size: 1.1em;
40
+ color: #a0a0a0;
41
+ }
42
+ .card-dark h2 {
43
+ font-size: 1.5em;
44
+ margin-top: 0;
45
+ padding-bottom: 10px;
46
+ border-bottom: 1px solid #3c3c3c;
47
+ color: #c586c0;
48
+ }
49
+ .card-dark h3 {
50
+ font-size: 1.2em;
51
+ color: #d4d4d4;
52
+ }
53
+ .styled-table {
54
+ display: table;
55
+ border: none;
56
+ width: 100%;
57
+ font-size: 0.95em;
58
+ margin-bottom: 0px;
59
+ }
60
+ .styled-table thead th {
61
+ background-color: #333333;
62
+ color: #c586c0;
63
+ text-align: left;
64
+ }
65
+ .styled-table th {
66
+ padding: 12px 15px;
67
+ }
68
+ .styled-table td {
69
+ padding: 0;
70
+ }
71
+ .styled-table table, .styled-table th, .styled-table td {
72
+ border-left: none;
73
+ border-right: none;
74
+ border-bottom: none;
75
+ }
76
+ .styled-table td {
77
+ border-bottom: 1px solid #3c3c3c;
78
+ }
79
+ .styled-table tbody tr {
80
+ transition: background-color 0.1s ease;
81
+ }
82
+ .styled-table tbody tr:hover {
83
+ background-color: #3a3a3a;
84
+ }
85
+ .styled-table tr:last-child td {
86
+ border-bottom: none;
87
+ }
88
+ .styled-table td a {
89
+ display: block;
90
+ padding: 12px 15px;
91
+ }
92
+ .styled-table td a.fake-link {
93
+ text-decoration:none;
94
+ color:inherit;
95
+ }
96
+ details {
97
+ margin-top: 20px;
98
+ border: 1px solid #3c3c3c;
99
+ border-radius: 8px;
100
+ overflow: hidden;
101
+ }
102
+ summary {
103
+ cursor: pointer;
104
+ padding: 12px 18px;
105
+ background-color: #6A5ACD;
106
+ font-weight: 600;
107
+ display: flex;
108
+ align-items: center;
109
+ gap: 10px;
110
+ justify-content: space-between;
111
+ list-style: none;
112
+ }
113
+ summary::-webkit-details-marker {
114
+ display: none;
115
+ }
116
+ summary:hover {
117
+ filter: brightness(1.1);
118
+ }
119
+ summary::after {
120
+ content: '';
121
+ display: inline-block;
122
+ width: 8px;
123
+ height: 8px;
124
+ border-bottom: 2px solid white;
125
+ border-right: 2px solid white;
126
+ transform: rotate(45deg);
127
+ transition: transform 0.3s ease;
128
+ }
129
+ details[open] > summary::after {
130
+ transform: rotate(225deg);
131
+ }
132
+ .details-content {
133
+ padding: 18px;
134
+ }
135
+ .btn-purple {
136
+ display: inline-block;
137
+ background-color: #6A5ACD;
138
+ color: white !important;
139
+ padding: 12px 24px;
140
+ border-radius: 8px;
141
+ text-decoration: none;
142
+ font-weight: 600;
143
+ transition: background-color 0.3s ease, transform 0.2s ease;
144
+ text-align: center;
145
+ }
146
+ .btn-purple:hover {
147
+ background-color: #7B68EE;
148
+ transform: translateY(-2px);
149
+ }
150
+ </style>
151
+
152
+ <div class="container-dark">
153
+
154
+ <div class="card-dark card-dark-title">
155
+ <h1>ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3</h1>
156
+ <p class="subtitle">
157
+ EXL3 quants of <a href="https://huggingface.co/nvidia/Qwen-3-Nemotron-32B-GenRM-Principle" target="_blank">nvidia/Qwen-3-Nemotron-32B-GenRM-Principle</a>, produced with <a href="https://github.com/turboderp-org/exllamav3/" target="_blank">exllamav3</a>.
158
+ </p>
159
+ </div>
160
+
161
+ <div class="card-dark">
162
+ <h2>Quants</h2>
163
+ <table class="styled-table">
164
+ <thead>
165
+ <tr>
166
+ <th>Quant</th>
167
+ <th>BPW</th>
168
+ <th>Head Bits</th>
169
+ <th>Size (GB)</th>
170
+ </tr>
171
+ </thead>
172
+ <tbody>
173
+ <tr>
174
+ <td><a href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/2.5bpw_H6" target="_blank">2.5_H6</a></td>
175
+ <td><a class="fake-link" href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/2.5bpw_H6" target="_blank">2.5</a></td>
176
+ <td><a class="fake-link" href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/2.5bpw_H6" target="_blank">6</a></td>
177
+ <td><a class="fake-link" href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/2.5bpw_H6" target="_blank">11.93</a></td>
178
+ </tr>
179
+ <tr>
180
+ <td><a href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/3.0bpw_H6" target="_blank">3.0_H6</a></td>
181
+ <td><a class="fake-link" href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/3.0bpw_H6" target="_blank">3.0</a></td>
182
+ <td><a class="fake-link" href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/3.0bpw_H6" target="_blank">6</a></td>
183
+ <td><a class="fake-link" href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/3.0bpw_H6" target="_blank">13.88</a></td>
184
+ </tr>
185
+ <tr>
186
+ <td><a href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/3.5bpw_H6" target="_blank">3.5_H6</a></td>
187
+ <td><a class="fake-link" href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/3.5bpw_H6" target="_blank">3.5</a></td>
188
+ <td><a class="fake-link" href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/3.5bpw_H6" target="_blank">6</a></td>
189
+ <td><a class="fake-link" href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/3.5bpw_H6" target="_blank">15.83</a></td>
190
+ </tr>
191
+ <tr>
192
+ <td><a href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/4.0bpw_H6" target="_blank">4.0_H6</a></td>
193
+ <td><a class="fake-link" href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/4.0bpw_H6" target="_blank">4.0</a></td>
194
+ <td><a class="fake-link" href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/4.0bpw_H6" target="_blank">6</a></td>
195
+ <td><a class="fake-link" href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/4.0bpw_H6" target="_blank">17.78</a></td>
196
+ </tr>
197
+ <tr>
198
+ <td><a href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/4.5bpw_H6" target="_blank">4.5_H6</a></td>
199
+ <td><a class="fake-link" href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/4.5bpw_H6" target="_blank">4.5</a></td>
200
+ <td><a class="fake-link" href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/4.5bpw_H6" target="_blank">6</a></td>
201
+ <td><a class="fake-link" href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/4.5bpw_H6" target="_blank">19.73</a></td>
202
+ </tr>
203
+ <tr>
204
+ <td><a href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/5.0bpw_H6" target="_blank">5.0_H6</a></td>
205
+ <td><a class="fake-link" href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/5.0bpw_H6" target="_blank">5.0</a></td>
206
+ <td><a class="fake-link" href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/5.0bpw_H6" target="_blank">6</a></td>
207
+ <td><a class="fake-link" href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/5.0bpw_H6" target="_blank">21.68</a></td>
208
+ </tr>
209
+ <tr>
210
+ <td><a href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/6.0bpw_H6" target="_blank">6.0_H6</a></td>
211
+ <td><a class="fake-link" href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/6.0bpw_H6" target="_blank">6.0</a></td>
212
+ <td><a class="fake-link" href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/6.0bpw_H6" target="_blank">6</a></td>
213
+ <td><a class="fake-link" href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/6.0bpw_H6" target="_blank">25.58</a></td>
214
+ </tr>
215
+ <tr>
216
+ <td><a href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/8.0bpw_H8" target="_blank">8.0_H8</a></td>
217
+ <td><a class="fake-link" href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/8.0bpw_H8" target="_blank">8.0</a></td>
218
+ <td><a class="fake-link" href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/8.0bpw_H8" target="_blank">8</a></td>
219
+ <td><a class="fake-link" href="https://huggingface.co/ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3/tree/8.0bpw_H8" target="_blank">33.57</a></td>
220
+ </tr>
221
+ </tbody>
222
+ </table>
223
+ </div>
224
+
225
+ <div class="card-dark">
226
+ <h2>How to Download and Use Quants</h2>
227
+ <p>You can download a specific quant by targeting its branch (revision) with the Hugging Face CLI.</p>
228
+ <details>
229
+ <summary>Click for download commands</summary>
230
+ <div class="details-content">
231
+ <b>1. Install huggingface-cli:</b>
232
+ <pre><code>pip install -U "huggingface_hub[cli]"</code></pre>
233
+ <b>2. Download a specific quant:</b>
234
+ <pre><code>huggingface-cli download ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3 --revision "5.0bpw_H6" --local-dir ./</code></pre>
235
+ </div>
236
+ </details>
237
+ <p>EXL3 quants can be run with any inference client that supports EXL3, such as <a href="https://github.com/theroyallab/tabbyapi" target="_blank"><b>TabbyAPI</b></a>. Refer to the <a href="https://github.com/theroyallab/tabbyAPI/wiki/01.-Getting-Started" target="_blank">documentation</a> for setup instructions.</p>
238
+ </div>
239
+
240
+ <div class="card-dark">
241
+ <h2>Quant Requests</h2>
242
+ <div style="text-align: center; margin-top: 25px;">
243
+ <a href="https://huggingface.co/ArtusDev/requests-exl/discussions/new?title=[MODEL_NAME_HERE]&description=[MODEL_HF_LINK_HERE]" class="btn-purple" target="_blank">Request EXL3 Quants</a>
244
+ </div>
245
+ <p class="subtitle">
246
+ See <a href="https://huggingface.co/ArtusDev/requests-exl" target="_blank">EXL community hub</a> for request guidelines.
247
+ </p>
248
+ </div>
249
+
250
+ </div>
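
The download step in the README can also be scripted. The sketch below is not part of the commit: it copies the branch names and sizes from the Quants table, picks the largest quant whose weights fit a given size budget, and fetches that branch with `huggingface_hub.snapshot_download`. The helper names `pick_quant` and `download_quant` are hypothetical, and the budget covers weights only; an inference client will need additional memory for the KV cache and activations, so treat the result as an upper bound rather than a recommendation.

```python
# Sizes in GB copied from the Quants table in the README, keyed by branch name.
QUANT_SIZES_GB = {
    "2.5bpw_H6": 11.93,
    "3.0bpw_H6": 13.88,
    "3.5bpw_H6": 15.83,
    "4.0bpw_H6": 17.78,
    "4.5bpw_H6": 19.73,
    "5.0bpw_H6": 21.68,
    "6.0bpw_H6": 25.58,
    "8.0bpw_H8": 33.57,
}

REPO_ID = "ArtusDev/nvidia_Qwen-3-Nemotron-32B-GenRM-Principle-EXL3"


def pick_quant(budget_gb):
    """Return the branch of the largest quant that fits budget_gb, or None.

    Hypothetical helper: compares weight size only, with no allowance
    for KV cache or activation memory.
    """
    fitting = [(size, rev) for rev, size in QUANT_SIZES_GB.items() if size <= budget_gb]
    return max(fitting)[1] if fitting else None


def download_quant(revision, local_dir="./"):
    """Download one quant branch; mirrors the huggingface-cli command in the README."""
    # Imported lazily so pick_quant works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, revision=revision, local_dir=local_dir)
```

For example, `download_quant(pick_quant(24.0))` would fetch the `5.0bpw_H6` branch into the current directory, the same revision used in the CLI command in the README.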