ReelevateLM-q4f16_1

This is the Meta Llama 3.1 Instruct model, fine‑tuned with LoRA and converted to MLC format with q4f16_1 quantization.

The model can be used through the MLC LLM chat CLI, REST server, and Python API, as shown below.

Example Usage

Before running any examples, install MLC LLM by following the installation documentation.

Chat (CLI)

mlc_llm chat HF://pr0methium/ReelevateLM-q4f16_1

REST Server

mlc_llm serve HF://pr0methium/ReelevateLM-q4f16_1
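The server exposes an OpenAI-compatible endpoint. A minimal request sketch, assuming the server is already running on the default bind address of 127.0.0.1:8000 (adjust if you pass --host or --port to mlc_llm serve):

```shell
# Query the running MLC LLM server via its OpenAI-compatible chat endpoint.
curl -X POST http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "HF://pr0methium/ReelevateLM-q4f16_1",
        "messages": [{"role": "user", "content": "Write me a 30 second reel story"}]
      }'
```

The response is a standard chat-completion JSON object; the generated text is under choices[0].message.content.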

Python API

from mlc_llm import MLCEngine

model = "HF://pr0methium/ReelevateLM-q4f16_1"
engine = MLCEngine(model)

for response in engine.chat.completions.create(
    messages=[{"role": "user", "content": "Write me a 30 second reel story…"}],
    model=model,
    stream=True,
):
    for choice in response.choices:
        print(choice.delta.content, end="", flush=True)
print()

engine.terminate()

Documentation

For more information on the MLC LLM project, please visit the docs and the GitHub repo.
