Collection: nvidia/OpenCodeReasoning-Nemotron-1.1 — OpenCodeReasoning-Nemotron-1.1 models (7B, 14B, 32B) in MLC format at various quants.
This is the OpenCodeReasoning-Nemotron-1.1-7B model converted to MLC format with q4f16_1 quantization.
The model can be used with MLC-LLM and WebLLM.
Before running the examples, please follow the MLC-LLM installation guide.
Chat from the command line:

```shell
mlc_llm chat HF://JackBinary/OpenCodeReasoning-Nemotron-1.1-7B-q4f16_1-MLC
```

Start an OpenAI-compatible REST server:

```shell
mlc_llm serve HF://JackBinary/OpenCodeReasoning-Nemotron-1.1-7B-q4f16_1-MLC
```
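Once the server is up, it can be queried with OpenAI-style chat completion requests. A minimal sketch using only the standard library, assuming the default host and port (`127.0.0.1:8000`) and the `/v1/chat/completions` route; adjust both if you started the server with different options.

```python
import json
import urllib.request

# Assumed defaults for `mlc_llm serve`: host 127.0.0.1, port 8000,
# OpenAI-compatible route /v1/chat/completions.
URL = "http://127.0.0.1:8000/v1/chat/completions"
MODEL = "HF://JackBinary/OpenCodeReasoning-Nemotron-1.1-7B-q4f16_1-MLC"


def build_payload(prompt: str) -> dict:
    """Build an OpenAI-style chat completion request body."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }


def ask(prompt: str) -> str:
    """POST the prompt to the server and return the reply text."""
    req = urllib.request.Request(
        URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Example (with the server running):
#   print(ask("Write a Python function that reverses a string."))
```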
Or use the Python API via `MLCEngine`:

```python
from mlc_llm import MLCEngine

# Create the engine directly from the Hugging Face repo.
model = "HF://JackBinary/OpenCodeReasoning-Nemotron-1.1-7B-q4f16_1-MLC"
engine = MLCEngine(model)

# Stream a chat completion, printing tokens as they arrive.
for response in engine.chat.completions.create(
    messages=[{"role": "user", "content": "What is the meaning of life?"}],
    model=model,
    stream=True,
):
    for choice in response.choices:
        print(choice.delta.content, end="", flush=True)
print("\n")

# Release model resources when done.
engine.terminate()
```
For more on MLC LLM, visit the documentation and GitHub repo.
Base model: Qwen/Qwen2.5-7B