This currently contains 191 examples. It's a quick test release; the full set of around 1k examples will be released soon.

I've done a quick manual cleaning of the data using Notepad++, so there may still be broken segments or other problems.

Uses ChatML. Recommended system prompt: You are an AI assistant that translates Japanese to English accurately.
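
For reference, a single translation turn with that system prompt looks like this in ChatML (the Japanese line is only an illustrative placeholder, not taken from the dataset):

```
<|im_start|>system
You are an AI assistant that translates Japanese to English accurately.<|im_end|>
<|im_start|>user
吾輩は猫である。<|im_end|>
<|im_start|>assistant
I am a cat.<|im_end|>
```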

Uses NilanE/ParallelFiction-Ja_En-100k for the data.

LoRA: mpasila/shisa-v2-JP-EN-Translator-v0.1-LoRA-12B

Uses the usual LoRA rank of 128 and alpha of 32. Trained with a 16384-token context window using QLoRA.
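
A minimal Unsloth sketch of that setup (the target-module list and other settings beyond rank/alpha/context/4-bit are assumptions based on typical QLoRA recipes, not confirmed by this card):

```python
from unsloth import FastLanguageModel

# Load the base model in 4-bit for QLoRA, at the 16384-token context used in training.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="shisa-ai/shisa-v2-mistral-nemo-12b",
    max_seq_length=16384,
    load_in_4bit=True,
)

# Attach LoRA adapters at rank 128 / alpha 32. The target modules below are an
# assumption (the usual attention + MLP projections), not stated in this card.
model = FastLanguageModel.get_peft_model(
    model,
    r=128,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```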

Token Count Statistics:

  • Total conversations: 191
  • Total tokens: 918486
  • Average tokens per conversation: 4808.83
  • Median tokens per conversation: 4187.0
  • Maximum tokens in a conversation: 13431
  • Minimum tokens in a conversation: 512

Token Distribution by Role:

  • System messages: 2483 tokens (0.27%)
  • Human messages: 494038 tokens (53.79%)
  • Assistant messages: 421965 tokens (45.94%)

Token Count Distribution:

  • 0-512: 0 conversations (0.00%)
  • 513-1024: 4 conversations (2.09%)
  • 1025-2048: 10 conversations (5.24%)
  • 2049-4096: 77 conversations (40.31%)
  • 4097-8192: 83 conversations (43.46%)
  • 8193-16384: 17 conversations (8.90%)
  • 16385+: 0 conversations (0.00%)
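
A hedged sketch of how per-role and per-conversation counts like the ones above can be computed, assuming the data follows the common ShareGPT-style `conversations`/`from`/`value` schema and the base model's tokenizer (both assumptions, not confirmed here):

```python
from collections import Counter
from transformers import AutoTokenizer

# Which tokenizer produced the stats above is an assumption; the base model's is used here.
tokenizer = AutoTokenizer.from_pretrained("shisa-ai/shisa-v2-mistral-nemo-12b")

def token_stats(conversations):
    """Tally tokens per role and per conversation for ShareGPT-style data:
    {"conversations": [{"from": role, "value": text}, ...]} (assumed schema)."""
    role_totals, per_conv = Counter(), []
    for conv in conversations:
        n = 0
        for msg in conv["conversations"]:
            t = len(tokenizer(msg["value"], add_special_tokens=False)["input_ids"])
            role_totals[msg["from"]] += t  # e.g. "system", "human", "gpt"
            n += t
        per_conv.append(n)
    return role_totals, per_conv
```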

Uploaded shisa-v2-JP-EN-Translator-v0.1-12B model:

  • Developed by: mpasila
  • License: apache-2.0
  • Finetuned from model: shisa-ai/shisa-v2-mistral-nemo-12b

This Mistral model was trained 2x faster with Unsloth and Hugging Face's TRL library.
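
A minimal inference sketch with transformers (the generation settings are illustrative, not tuned recommendations, and the Japanese input is just an example sentence):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mpasila/shisa-v2-JP-EN-Translator-v0.1-12B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system",
     "content": "You are an AI assistant that translates Japanese to English accurately."},
    {"role": "user", "content": "吾輩は猫である。名前はまだ無い。"},  # illustrative input
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```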
