jaked97/llm-jp-3-13b-it-bs4-ac10-step251-fp8

Text Generation

text-generation-inference

compressed-tensors


jaked97 committed on Dec 16, 2024

Commit 2e16c97 · verified · 1 Parent(s): 88d79be

update

Files changed (1)

README.md +70 -0

README.md

@@ -3,6 +3,76 @@ library_name: transformers
 tags: []
 ---
+## Inference code
+* Install the following packages beforehand.
+  * pip install -q numpy==1.26.4
+  * pip install -q vllm==0.6.4
+```python
+import json
+
+from datasets import load_dataset
+from huggingface_hub import snapshot_download
+from vllm import LLM, SamplingParams
+
+# Directory that holds the input file; the results are written here as well.
+data_dir = "."
+data_files = {"test": data_dir + "/elyza-tasks-100-TV_0.jsonl"}
+tasks = load_dataset("json", data_files=data_files, split="test")
+
+# Download the quantized checkpoint and get its local path.
+model_name = "llm-jp-3-13b-it-bs4-ac10-step251-fp8"
+model_path = snapshot_download(repo_id="jaked97/" + model_name)
+
+# Build one prompt per task (the instruction text is kept in Japanese for the model).
+prompts = [
+    f"""### instruction:
+あなたは親切なAIアシスタントです。
+### input:
+{task_input}
+### output:
+""" for task_input in tasks["input"]]
+
+llm = LLM(
+    model=model_path,
+    gpu_memory_utilization=0.95,
+    quantization="compressed-tensors",
+    trust_remote_code=True,
+    enforce_eager=True,
+)
+
+# Run inference
+outputs = llm.generate(
+    prompts,
+    sampling_params=SamplingParams(
+        temperature=0,
+        max_tokens=512,
+        repetition_penalty=1.2,
+        skip_special_tokens=True,
+        seed=97,
+    ),
+)
+
+# Save the results as JSONL (one JSON object per line)
+with open(data_dir + f"/{model_name}_max512-vllm.jsonl", "w", encoding="utf-8") as f:
+    for task, output in zip(tasks, outputs):
+        result = {
+            "task_id": task["task_id"],
+            "input": task["input"],
+            "output": output.outputs[0].text,
+        }
+        json.dump(result, f, ensure_ascii=False)
+        f.write("\n")
+```
 # Model Card for Model ID
 <!-- Provide a quick summary of what the model is/does. -->
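
The prompt template and the JSONL output format used by the README script can be checked without a GPU or vLLM. The following is a minimal sketch, not part of the commit: the helper names `build_prompt`, `write_jsonl`, and `read_jsonl` are hypothetical and only mirror what the script does inline.

```python
import io
import json

# Same instruction text as in the README prompt template.
INSTRUCTION_TEXT = "あなたは親切なAIアシスタントです。"


def build_prompt(task_input: str) -> str:
    """Return the prompt string in the instruction/input/output layout used by the README."""
    return (
        "### instruction:\n"
        f"{INSTRUCTION_TEXT}\n"
        "### input:\n"
        f"{task_input}\n"
        "### output:\n"
    )


def write_jsonl(records, fh) -> None:
    """Write one JSON object per line, keeping non-ASCII text readable."""
    for record in records:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(fh) -> list:
    """Read records back, skipping blank lines."""
    return [json.loads(line) for line in fh if line.strip()]


# Round trip through an in-memory buffer instead of a real file.
sample_records = [{"task_id": 0, "input": "こんにちは", "output": "placeholder"}]
buffer = io.StringIO()
write_jsonl(sample_records, buffer)
buffer.seek(0)
loaded_records = read_jsonl(buffer)
```

Keeping `ensure_ascii=False` means the Japanese text is stored as-is rather than as `\uXXXX` escapes, which makes the output file easier to inspect by hand.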