aozoranemo292m
A small Japanese language model (~292M parameters) with a 16k-vocabulary SentencePiece tokenizer.
- Parameters: 292,478,784 (see the verification sketch after this list)
- Tokenizer: SentencePiece, 16k vocabulary
- Best checkpoint (per training logs): checkpoint-100 (continued training ran up to step=600)
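The parameter count and vocabulary size above can be checked after downloading the model. This is a minimal sketch, assuming the repo loads with the standard AutoModelForCausalLM / AutoTokenizer classes as in the Quick start below.

```python
# Minimal sketch: verify the spec figures listed above.
from transformers import AutoTokenizer, AutoModelForCausalLM

repo = "hillfieldhonest/aozoranemo292m"
tok = AutoTokenizer.from_pretrained(repo, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(repo)

print(f"parameters: {sum(p.numel() for p in model.parameters()):,}")  # expected: 292,478,784
print(f"vocab size: {tok.vocab_size}")                                # expected: ~16k (SentencePiece)
```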
Training data (references)
- Aozora Bunko (clean): https://huggingface.co/datasets/globis-university/aozorabunko-clean
- Nemotron Personas (Japan): https://huggingface.co/blog/nvidia/nemotron-personas-japan?linkId=100000383918769 (please use each dataset in accordance with its license; a loading sketch for the Aozora Bunko data follows below)
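To inspect the Aozora Bunko reference corpus, it can be pulled with the Hugging Face `datasets` library. This is a minimal sketch; the split and column names (`train`, `text`) are assumptions based on typical dataset layouts and should be checked against the dataset card.

```python
# Minimal sketch: load the Aozora Bunko (clean) corpus for inspection.
from datasets import load_dataset

ds = load_dataset("globis-university/aozorabunko-clean", split="train")  # split name assumed
print(ds)                    # number of rows and column names
print(ds[0]["text"][:200])   # first 200 characters of the first record ("text" column assumed)
```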
Quick start
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

repo = "hillfieldhonest/aozoranemo292m"

# Load the slow (SentencePiece) tokenizer and make sure a pad token exists.
tok = AutoTokenizer.from_pretrained(repo, use_fast=False)
if tok.pad_token is None:
    tok.pad_token = tok.eos_token

# Pick the best available precision: bf16 > fp16 on GPU, fp32 on CPU.
dtype = (torch.bfloat16 if torch.cuda.is_available() and torch.cuda.is_bf16_supported()
         else torch.float16 if torch.cuda.is_available() else torch.float32)

# device_map="auto" requires the `accelerate` package.
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=dtype, device_map="auto").eval()

# Sampled generation from a short Japanese prompt ("Introduce yourself in two sentences or fewer.").
out = model.generate(**tok("2文以内で自己紹介して。", return_tensors="pt").to(model.device),
                     max_new_tokens=96, do_sample=True, temperature=0.7, top_p=0.9)
print(tok.decode(out[0], skip_special_tokens=True))
```
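For batched prompts, the pad token set above is what makes padding possible; decoder-only models also generally want left padding so generated tokens follow the prompt directly. A minimal sketch under those assumptions, reusing `tok` and `model` from the Quick start:

```python
# Minimal sketch: batched generation. padding_side="left" is a common convention
# for decoder-only models; adjust if outputs look truncated.
tok.padding_side = "left"
prompts = ["好きな季節について一文で。",        # "One sentence about your favorite season."
           "今日の天気を一文で説明して。"]      # "Describe today's weather in one sentence."
batch = tok(prompts, return_tensors="pt", padding=True).to(model.device)
outs = model.generate(**batch, max_new_tokens=64, do_sample=True, temperature=0.7, top_p=0.9)
for o in outs:
    print(tok.decode(o, skip_special_tokens=True))
```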
License
- Weights: Apache-2.0 (provisional; may be adjusted as needed)
- Data: follow the terms of the referenced sources above