This is a model for testing llama.cpp-based runtimes. The goal is to have the smallest working GGUF file possible.

Format: GGUF

Model size: 12.4k params

Architecture: llama
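
Since the file is meant as a minimal test fixture, a runtime's test suite might start with a header sanity check before attempting a full load. The sketch below is one way to do that in Python, not part of this model or of llama.cpp. It assumes the GGUF version 2 or later header layout (4-byte magic, 32-bit version, 64-bit tensor count, 64-bit metadata key/value count, all little-endian), and `read_gguf_header` is a hypothetical helper name.

```python
import struct

GGUF_MAGIC = b"GGUF"
HEADER_SIZE = 24  # 4 (magic) + 4 (version) + 8 (tensor count) + 8 (metadata kv count)


def read_gguf_header(data: bytes) -> dict:
    """Parse the fixed-size part of a GGUF header.

    Assumption: GGUF version 2 or later, little-endian. Version 1 files
    used 32-bit counts and would be misread by this sketch.
    """
    if len(data) < HEADER_SIZE:
        raise ValueError("too short to contain a GGUF header")
    if data[:4] != GGUF_MAGIC:
        raise ValueError("missing GGUF magic bytes")
    # "<" disables padding, so the three fields occupy exactly 20 bytes.
    version, tensor_count, kv_count = struct.unpack_from("<IQQ", data, 4)
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }
```

For a file on disk, pass the first 24 bytes, for example `read_gguf_header(open(path, "rb").read(24))`, where `path` is wherever the test model was downloaded.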
