---
license: mit
base_model:
- microsoft/bitnet-b1.58-2B-4T
pipeline_tag: text-generation
base_model_relation: quantized
tags:
- bitnet
- ik_llama.cpp
---

The IQ2_BN and IQ2_BN_R4 versions of [microsoft/bitnet-b1.58-2B-4T-gguf](https://huggingface.co/microsoft/bitnet-b1.58-2B-4T) for use with ik_llama.cpp. I recommend the IQ2_BN_R4 version, but you can also use `-rtr` on IQ2_BN to convert at runtime.

The chat template in the model looks incorrect (I did not change it; it comes from the original Microsoft GGUF). An example of correct usage from their [transformers PR](https://github.com/huggingface/transformers/pull/37503/files):

`<|begin_of_text|>User: Hey, are you conscious? Can you talk to me?<|eot_id|>Assistant:`

I was able to follow the example above, and it worked for multi-turn conversations.
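
If you bypass the embedded chat template and build prompts by hand, the format shown above can be sketched as a small helper. This is a minimal sketch, not the official template: `build_prompt` is a hypothetical name, and the multi-turn layout (each earlier turn, including assistant replies, terminated with `<|eot_id|>`) is my assumption extrapolated from the single-turn example.

```python
# Special tokens taken from the single-turn example above.
BOS = "<|begin_of_text|>"
EOT = "<|eot_id|>"


def build_prompt(messages):
    """Build a prompt string from a list of (role, text) pairs.

    role is "User" or "Assistant". Assumption: every completed turn,
    including assistant replies, ends with <|eot_id|>.
    """
    parts = [BOS]
    for role, text in messages:
        parts.append(f"{role}: {text}{EOT}")
    # Leave the final assistant turn open so the model generates the reply.
    parts.append("Assistant:")
    return "".join(parts)


single_turn = build_prompt([("User", "Hey, are you conscious? Can you talk to me?")])
print(single_turn)
```

For a single user message this reproduces the example string from the transformers PR exactly. For longer conversations, append each model reply as an `("Assistant", reply)` pair before the next user message.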