aozoranemo292m

A Japanese mini LLM (~292M parameters) with a 16k-vocabulary SentencePiece tokenizer.

  • Parameters: 292,478,784
  • Tokenizer: SentencePiece 16k (see the inspection sketch after this list)
  • Best checkpoint (logs): checkpoint-100 (continued training ran up to step=600)
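
To see what the tokenizer actually produces, here is a minimal inspection sketch (not part of the original card); it only uses the repo id shown on this page, and the example sentence is arbitrary:

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("hillfieldhonest/aozoranemo292m", use_fast=False)
print(tok.vocab_size)  # should report roughly 16k, matching the SentencePiece 16k vocabulary
ids = tok("吾輩は猫である。")["input_ids"]
print(len(ids), tok.convert_ids_to_tokens(ids))  # token count and the SentencePiece pieces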

Training data (references)

Quick start

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

repo = "hillfieldhonest/aozoranemo292m"

# Load the SentencePiece tokenizer (slow tokenizer required) and make sure a pad token is set.
tok = AutoTokenizer.from_pretrained(repo, use_fast=False)
if tok.pad_token is None:
    tok.pad_token = tok.eos_token

# Pick the best available precision: bf16 > fp16 on GPU, fp32 on CPU.
dtype = (torch.bfloat16 if torch.cuda.is_available() and torch.cuda.is_bf16_supported()
         else torch.float16 if torch.cuda.is_available() else torch.float32)
# Note: on older transformers releases this argument is named torch_dtype.
model = AutoModelForCausalLM.from_pretrained(repo, dtype=dtype, device_map="auto").eval()

# Sample a short reply to a Japanese prompt ("Introduce yourself in two sentences or fewer.").
out = model.generate(**tok("2文以内で自己紹介して。", return_tensors="pt").to(model.device),
                     max_new_tokens=96, do_sample=True, temperature=0.7, top_p=0.9)
print(tok.decode(out[0], skip_special_tokens=True))
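
For interactive use, output can also be streamed token by token. A minimal sketch using transformers' TextStreamer, assuming the tok and model objects from the quick start above (the prompt simply asks "What is Aozora Bunko?"):

from transformers import TextStreamer

streamer = TextStreamer(tok, skip_prompt=True, skip_special_tokens=True)  # print tokens as they arrive
inputs = tok("青空文庫とは何ですか?", return_tensors="pt").to(model.device)
_ = model.generate(**inputs, streamer=streamer,
                   max_new_tokens=96, do_sample=True, temperature=0.7, top_p=0.9)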

License

  • Weights: Apache-2.0 (provisional; may be adjusted as needed)
  • Data: subject to the terms of the sources referenced above