Add paper abstract to model card

This PR adds the paper abstract to the model card for better documentation.

README.md (changed):
````diff
@@ -1,15 +1,15 @@
 ---
-license: mit
-license_link: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/blob/main/LICENSE
 language:
 - en
+library_name: transformers
+license: mit
+license_link: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/blob/main/LICENSE
 pipeline_tag: text-generation
 tags:
 - chat
 - bitnet
 - text-generation
 - large-language-model
-library_name: transformers
 ---
 
 # BitNet b1.58 2B4T - Scaling Native 1-bit LLM
@@ -22,6 +22,10 @@ Trained on a corpus of 4 trillion tokens, this model demonstrates that native 1-
 
 ➡️ **Official Inference Code:** [microsoft/BitNet (bitnet.cpp)](https://github.com/microsoft/BitNet)
 
+# Paper abstract
+
+The abstract of the paper is the following:
+
 ## Model Variants
 
 Several versions of the model weights are available on Hugging Face:
@@ -98,7 +102,8 @@ chat_input = tokenizer(prompt, return_tensors="pt").to(model.device)
 # Generate response
 chat_outputs = model.generate(**chat_input, max_new_tokens=50)
 response = tokenizer.decode(chat_outputs[0][chat_input['input_ids'].shape[-1]:], skip_special_tokens=True) # Decode only the response part
-print("\nAssistant Response:", response)
+print("
+Assistant Response:", response)
 ```
 
 ## How to Use (with `bitnet.cpp`)
@@ -141,4 +146,4 @@ BitNet b1.58 2B4T was evaluated against leading open-weight full-precision LLMs
 The model weights and code are released under the [MIT License](https://huggingface.co/microsoft/bitnet-b1.58-2B-4T/blob/main/LICENSE).
 
 ## Disclaimer
-This model is intended for research and development purposes. While efforts have been made to align it using SFT and DPO, it may still produce outputs that are unexpected, biased, or inaccurate. Please use responsibly.
+This model is intended for research and development purposes. While efforts have been made to align it using SFT and DPO, it may still produce outputs that are unexpected, biased, or inaccurate. Please use responsibly.
````
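For context, the `tokenizer.decode(...)` line touched in the third hunk relies on `model.generate` returning the prompt tokens concatenated with the newly generated tokens, so slicing at the prompt length isolates the assistant's reply. A minimal sketch of that slicing, using hypothetical stand-in token IDs so no model download is needed:

```python
# Stand-in token IDs (hypothetical values, chosen only for illustration).
prompt_ids = [101, 7592, 2088]              # plays the role of chat_input['input_ids'][0]
generated = prompt_ids + [2023, 2003, 102]  # plays the role of chat_outputs[0]:
                                            # generate() echoes the prompt, then appends new tokens

# Equivalent of chat_outputs[0][chat_input['input_ids'].shape[-1]:]
# from the README snippet: drop the prompt prefix, keep only the reply.
response_ids = generated[len(prompt_ids):]

print("\nAssistant response token IDs:", response_ids)  # → [2023, 2003, 102]
```

With real tensors the slice syntax is identical; the only difference is that the prompt length comes from `chat_input['input_ids'].shape[-1]` instead of `len(prompt_ids)`, and the result is passed to `tokenizer.decode` with `skip_special_tokens=True`.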