TensorBlock


ServiceNow-AI/Apriel-Nemotron-15b-Thinker - GGUF

This repo contains GGUF format model files for ServiceNow-AI/Apriel-Nemotron-15b-Thinker.

The files were quantized using machines provided by TensorBlock, and they are compatible with llama.cpp as of commit b5753.

Our projects

Forge
An OpenAI-compatible multi-provider routing layer.
🚀 Try it now! 🚀

Awesome MCP Servers
A comprehensive collection of Model Context Protocol (MCP) servers.
👀 See what we built 👀

TensorBlock Studio
A lightweight, open, and extensible multi-LLM interaction studio.
👀 See what we built 👀

Prompt template

<|system|>
You are a thoughtful and systematic AI assistant built by ServiceNow Language Models (SLAM) lab. Before providing an answer, analyze the problem carefully and present your reasoning step by step. After explaining your thought process, provide the final solution in the following format: [BEGIN FINAL RESPONSE] ... [END FINAL RESPONSE].

{system_prompt}
<|end|>
<|user|>
{prompt}
<|end|>
<|assistant|>
Here are my reasoning steps:
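
To make the template concrete, here is a minimal Python sketch of filling it by hand. `build_prompt` is a hypothetical helper name, and plain string formatting is an assumption; a chat-template-aware runtime may apply the template for you.

```python
# Minimal sketch: fill the Apriel prompt template by hand.
# Assumption: plain string formatting; a chat-template-aware runtime
# may handle this automatically.
SYSTEM_PREAMBLE = (
    "You are a thoughtful and systematic AI assistant built by ServiceNow "
    "Language Models (SLAM) lab. Before providing an answer, analyze the "
    "problem carefully and present your reasoning step by step. After "
    "explaining your thought process, provide the final solution in the "
    "following format: [BEGIN FINAL RESPONSE] ... [END FINAL RESPONSE]."
)

def build_prompt(system_prompt: str, prompt: str) -> str:
    # The trailing "Here are my reasoning steps:" line primes the model
    # to begin its step-by-step reasoning, as in the template above.
    return (
        f"<|system|>\n{SYSTEM_PREAMBLE}\n\n{system_prompt}\n<|end|>\n"
        f"<|user|>\n{prompt}\n<|end|>\n"
        "<|assistant|>\nHere are my reasoning steps:\n"
    )

print(build_prompt("Answer concisely.", "What is 17 * 24?"))
```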

Model file specification

| Filename | Quant type | File Size | Description |
| -------- | ---------- | --------- | ----------- |
| Apriel-Nemotron-15b-Thinker-Q2_K.gguf | Q2_K | 5.794 GB | smallest, significant quality loss - not recommended for most purposes |
| Apriel-Nemotron-15b-Thinker-Q3_K_S.gguf | Q3_K_S | 6.706 GB | very small, high quality loss |
| Apriel-Nemotron-15b-Thinker-Q3_K_M.gguf | Q3_K_M | 7.396 GB | very small, high quality loss |
| Apriel-Nemotron-15b-Thinker-Q3_K_L.gguf | Q3_K_L | 7.990 GB | small, substantial quality loss |
| Apriel-Nemotron-15b-Thinker-Q4_0.gguf | Q4_0 | 8.606 GB | legacy; small, very high quality loss - prefer using Q3_K_M |
| Apriel-Nemotron-15b-Thinker-Q4_K_S.gguf | Q4_K_S | 8.663 GB | small, greater quality loss |
| Apriel-Nemotron-15b-Thinker-Q4_K_M.gguf | Q4_K_M | 9.113 GB | medium, balanced quality - recommended |
| Apriel-Nemotron-15b-Thinker-Q5_0.gguf | Q5_0 | 10.393 GB | legacy; medium, balanced quality - prefer using Q4_K_M |
| Apriel-Nemotron-15b-Thinker-Q5_K_S.gguf | Q5_K_S | 10.393 GB | large, low quality loss - recommended |
| Apriel-Nemotron-15b-Thinker-Q5_K_M.gguf | Q5_K_M | 10.655 GB | large, very low quality loss - recommended |
| Apriel-Nemotron-15b-Thinker-Q6_K.gguf | Q6_K | 12.293 GB | very large, extremely low quality loss |
| Apriel-Nemotron-15b-Thinker-Q8_0.gguf | Q8_0 | 15.919 GB | very large, extremely low quality loss - not recommended |
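
As a usage sketch, the recommended Q4_K_M file can be loaded with the llama-cpp-python bindings. This is an assumption on our part (the repo only states compatibility with llama.cpp itself), and the local path is a placeholder; see the downloading instructions below first.

```python
from llama_cpp import Llama

# Placeholder path: assumes the Q4_K_M file from the table above has
# already been downloaded into MY_LOCAL_DIR.
llm = Llama(
    model_path="MY_LOCAL_DIR/Apriel-Nemotron-15b-Thinker-Q4_K_M.gguf",
    n_ctx=4096,  # context window; adjust to your memory budget
)

# Raw completion against the prompt template shown above.
prompt = (
    "<|system|>\nYou are a helpful assistant.\n<|end|>\n"
    "<|user|>\nWhat is 17 * 24?\n<|end|>\n"
    "<|assistant|>\nHere are my reasoning steps:\n"
)
out = llm(prompt, max_tokens=512, stop=["<|end|>"])
print(out["choices"][0]["text"])
```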

Downloading instructions

Command line

First, install the Hugging Face Hub CLI:

pip install -U "huggingface_hub[cli]"

Then, download an individual model file to a local directory:

huggingface-cli download tensorblock/ServiceNow-AI_Apriel-Nemotron-15b-Thinker-GGUF --include "Apriel-Nemotron-15b-Thinker-Q2_K.gguf" --local-dir MY_LOCAL_DIR

If you want to download multiple model files matching a pattern (e.g., *Q4_K*gguf), you can try:

huggingface-cli download tensorblock/ServiceNow-AI_Apriel-Nemotron-15b-Thinker-GGUF --local-dir MY_LOCAL_DIR --local-dir-use-symlinks False --include='*Q4_K*gguf'
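
The same downloads can also be scripted from Python with the huggingface_hub package installed above; as a sketch, `snapshot_download` with `allow_patterns` mirrors the `--include` pattern of the CLI.

```python
from huggingface_hub import hf_hub_download, snapshot_download

# Download a single quant file to a local directory.
path = hf_hub_download(
    repo_id="tensorblock/ServiceNow-AI_Apriel-Nemotron-15b-Thinker-GGUF",
    filename="Apriel-Nemotron-15b-Thinker-Q2_K.gguf",
    local_dir="MY_LOCAL_DIR",
)
print(path)  # local path of the downloaded file

# Or fetch every file matching a pattern, like the --include flag above.
snapshot_download(
    repo_id="tensorblock/ServiceNow-AI_Apriel-Nemotron-15b-Thinker-GGUF",
    allow_patterns=["*Q4_K*gguf"],
    local_dir="MY_LOCAL_DIR",
)
```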
Format: GGUF
Model size: 15B params
Architecture: llama