Nemotron models that have been converted and/or quantized to work well in vLLM
Michael Goin
mgoin
AI & ML interests
LLM inference optimization, compression, quantization, pruning, distillation
Recent Activity
updated a model 1 day ago: RedHatAI/Mistral-Small-3.2-24B-Instruct-2506-FP8
published a model 1 day ago: RedHatAI/Mistral-Small-3.2-24B-Instruct-2506-FP8
updated a model 11 days ago: mgoin/SEMIKONG-70B-W4A16-G128