---
license: apache-2.0
datasets:
- CohereForAI/aya_dataset
language:
- en
- ja
widget:
- text: "自然言語処理とは何か"
---
# llm-jp-1.3b-v1.0-aya
llm-jp's [llm-jp-1.3b-v1.0](https://huggingface.co/llm-jp/llm-jp-1.3b-v1.0) model, fine-tuned on the Japanese examples from Cohere's [Aya dataset](https://huggingface.co/datasets/CohereForAI/aya_dataset).
| Model | [llm-jp-eval AVG](https://wandb.ai/wandb-japan/llm-leaderboard/reports/Nejumi-LLM-Leaderboard-Evaluating-Japanese-Language-Proficiency--Vmlldzo2MzU3NzIy#deep-dive-into-llm-jp-eval) |
|-----------------------------------|---------|
| kcoopermiller/llm-jp-1.3b-v1.0-aya | **0.0698** |
| llm-jp/llm-jp-1.3b-v1.0 | 0.047 |
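The fine-tuning data is the Japanese subset of the Aya dataset. Below is a minimal sketch of how that subset can be extracted with the `datasets` library; it assumes the dataset's `language` column labels Japanese rows as `"Japanese"`, as in the published schema.

```python
from datasets import load_dataset

# Load the full Aya dataset and keep only the Japanese examples.
# Assumes the "language" column marks Japanese rows as "Japanese".
aya = load_dataset("CohereForAI/aya_dataset", split="train")
aya_ja = aya.filter(lambda ex: ex["language"] == "Japanese")

print(len(aya_ja))          # number of Japanese examples
print(aya_ja[0]["inputs"])  # first Japanese prompt
```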
## How to use
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("kcoopermiller/llm-jp-1.3b-v1.0-aya")
model = AutoModelForCausalLM.from_pretrained("kcoopermiller/llm-jp-1.3b-v1.0-aya", device_map="auto")

text = "自然言語処理とは何か"  # "What is natural language processing?"
tokenized_input = tokenizer.encode(text, add_special_tokens=False, return_tensors="pt").to(model.device)

# Generate with nucleus sampling; gradient tracking is disabled for inference
with torch.no_grad():
    output = model.generate(
        tokenized_input,
        max_new_tokens=20,
        do_sample=True,
        top_p=0.90,
        temperature=0.7,
    )[0]
print(tokenizer.decode(output))
```
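For quick experiments, the same checkpoint can also be run through the `transformers` pipeline API. This is a sketch, not part of the original card; the sampling parameters simply mirror the example above.

```python
from transformers import pipeline

# Text-generation pipeline with the same sampling settings as above
generator = pipeline(
    "text-generation",
    model="kcoopermiller/llm-jp-1.3b-v1.0-aya",
    device_map="auto",
)
result = generator(
    "自然言語処理とは何か",
    max_new_tokens=20,
    do_sample=True,
    top_p=0.90,
    temperature=0.7,
)
print(result[0]["generated_text"])
```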