openbmb/MiniCPM4.1-8B-AutoAWQ

Text Generation

4-bit precision

guanwenyu1995 committed 12 days ago

Commit 8212c3b · verified · 1 Parent(s): 9f2756b

Create README.md

Files changed (1)

README.md +64 -0

README.md ADDED

@@ -0,0 +1,64 @@

+---
+license: apache-2.0
+language:
+- zh
+- en
+pipeline_tag: text-generation
+library_name: transformers
+---
+<div align="center">
+<img src="https://github.com/OpenBMB/MiniCPM/blob/main/assets/minicpm_logo.png?raw=true" width="500em" ></img>
+</div>
+## Usage
+### Prebuilt [AutoAWQ](https://github.com/casper-hansen/AutoAWQ.git)
+```bash
+pip install autoawq
+```
+### Inference
+```python
+from awq import AutoAWQForCausalLM
+import torch
+from transformers import AutoTokenizer
+prompt = "北京有什么好玩的地方？"
+quant_path = "MiniCPM4.1-8B-AutoAWQ"
+messages = [{"role": "user", "content": prompt}]
+model = AutoAWQForCausalLM.from_quantized(
+    quant_path,
+    fuse_layers=False,
+    trust_remote_code=True
+)
+tokenizer = AutoTokenizer.from_pretrained(
+    quant_path,
+    trust_remote_code=True
+)
+device = next(model.model.parameters()).device
+# To enable thinking mode, use this line instead:
+# formatted_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True, enable_thinking=True)
+# Thinking mode disabled:
+formatted_prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True, enable_thinking=False)
+input_ids = tokenizer.encode(formatted_prompt, return_tensors='pt').to(device)
+outputs = model.generate(
+    input_ids,
+    max_new_tokens=1000,
+    do_sample=True
+)
+# If thinking mode is enabled, keep everything after the assistant marker:
+# ans = [i.split("<|im_start|> assistant\n", 1)[1].strip() for i in tokenizer.batch_decode(outputs)]
+# If thinking mode is disabled, also skip the empty <think> block:
+ans = [i.split("<|im_start|> assistant\n<think>\n\n</think>", 1)[1].strip() for i in tokenizer.batch_decode(outputs)]
+```
+<p align="center">
+<a href="https://github.com/OpenBMB/MiniCPM/" target="_blank">GitHub Repo</a> |
+<a href="https://arxiv.org/abs/2506.07900" target="_blank">Technical Report</a> |
+<a href="https://mp.weixin.qq.com/s/KIhH2nCURBXuFXAtYRpuXg?poc_token=HBIsUWijxino8oJ5s6HcjcfXFRi0Xj2LJlxPYD9c">Join Us</a>
+</p>
+<p align="center">
+👋 Contact us on <a href="https://discord.gg/3cGQn9b3YM" target="_blank">Discord</a> and <a href="https://github.com/OpenBMB/MiniCPM/blob/main/assets/wechat.jpg" target="_blank">WeChat</a>
+</p>
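
Note on the answer-extraction lines in the README: `i.split(marker, 1)[1]` raises `IndexError` whenever the decoded text does not contain the exact marker string. Below is a minimal sketch of a more tolerant alternative. `extract_answer` is a hypothetical helper, not part of AutoAWQ or Transformers, and the `<|im_end|>` end-of-turn token is an assumption about the chat template rather than something this commit states. The helper also drops any `<think>...</think>` block, so it suits cases where only the final answer is wanted.

```python
import re

# The README's split string has a space after <|im_start|>; the pattern
# below tolerates optional whitespace there (assumption about decoding).
ASSISTANT_MARKER = re.compile(r"<\|im_start\|>\s*assistant\n")
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)
END_MARKER = "<|im_end|>"  # assumed end-of-turn token


def extract_answer(decoded: str) -> str:
    """Return the assistant reply from one decoded sequence (hypothetical helper)."""
    # Take the text after the last assistant marker; if no marker is
    # present, fall back to the whole string instead of raising.
    reply = ASSISTANT_MARKER.split(decoded)[-1]
    # Remove the thinking block, whether empty or populated.
    reply = THINK_BLOCK.sub("", reply)
    # Cut everything from the end-of-turn token onward.
    reply = reply.split(END_MARKER, 1)[0]
    return reply.strip()
```

With the README's variables, this could be applied as `ans = [extract_answer(i) for i in tokenizer.batch_decode(outputs)]`.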