This release contains only 191 examples; it's just a quick test. The full set of around 1k examples will be released soon.
I've done a quick manual cleaning of the data using Notepad++, so there may still be broken content or other problems.
Uses the ChatML prompt format. Recommended system prompt: "You are an AI assistant that translates Japanese to English accurately."
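For reference, a minimal sketch of what a ChatML-formatted translation prompt would look like with that system prompt (the user content is a placeholder):

```
<|im_start|>system
You are an AI assistant that translates Japanese to English accurately.<|im_end|>
<|im_start|>user
{Japanese source text}<|im_end|>
<|im_start|>assistant
```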
Uses NilanE/ParallelFiction-Ja_En-100k for the data.
LoRA: mpasila/shisa-v2-JP-EN-Translator-v0.1-LoRA-12B
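If you'd rather apply the adapter yourself instead of using the merged model, a minimal PEFT sketch might look like this (it assumes the adapter loads on top of the shisa-v2 base listed below):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model, then attach the LoRA adapter on top of it.
base = AutoModelForCausalLM.from_pretrained(
    "shisa-ai/shisa-v2-mistral-nemo-12b", torch_dtype="auto", device_map="auto"
)
model = PeftModel.from_pretrained(base, "mpasila/shisa-v2-JP-EN-Translator-v0.1-LoRA-12B")
tokenizer = AutoTokenizer.from_pretrained("shisa-ai/shisa-v2-mistral-nemo-12b")
```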
Uses the usual LoRA rank of 128 with alpha 32. Trained with a 16384-token context window using QLoRA.
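A rough Unsloth sketch of those hyperparameters (the target_modules list is an assumption based on Unsloth's usual defaults, not confirmed from the actual training script):

```python
from unsloth import FastLanguageModel

# 4-bit base weights (QLoRA) with the stated 16384-token context window.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="shisa-ai/shisa-v2-mistral-nemo-12b",
    max_seq_length=16384,
    load_in_4bit=True,
)
# LoRA rank 128, alpha 32; target_modules are assumed, not confirmed.
model = FastLanguageModel.get_peft_model(
    model,
    r=128,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```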
Token Count Statistics (a counting sketch follows the distribution list below):
- Total conversations: 191
- Total tokens: 918486
- Average tokens per conversation: 4808.83
- Median tokens per conversation: 4187.0
- Maximum tokens in a conversation: 13431
- Minimum tokens in a conversation: 512
Token Distribution by Role:
- System messages: 2483 tokens (0.27%)
- Human messages: 494038 tokens (53.79%)
- Assistant messages: 421965 tokens (45.94%)
Token Count Distribution:
- 0-512: 0 conversations (0.00%)
- 513-1024: 4 conversations (2.09%)
- 1025-2048: 10 conversations (5.24%)
- 2049-4096: 77 conversations (40.31%)
- 4097-8192: 83 conversations (43.46%)
- 8193-16384: 17 conversations (8.90%)
- 16385+: 0 conversations (0.00%)
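The counting sketch referenced above. The ShareGPT-style field names ("conversations", "from", "value") are assumptions about the dataset layout, not confirmed from the released files:

```python
from collections import Counter
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("shisa-ai/shisa-v2-mistral-nemo-12b")

def count_tokens_by_role(dataset):
    # Tally tokens per role across ShareGPT-style records (field names assumed).
    totals = Counter()
    for record in dataset:
        for msg in record["conversations"]:
            n = len(tokenizer.encode(msg["value"], add_special_tokens=False))
            totals[msg["from"]] += n
    return totals
```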
Uploaded shisa-v2-JP-EN-Translator-v0.1-12B model
- Developed by: mpasila
- License: apache-2.0
- Finetuned from model: shisa-ai/shisa-v2-mistral-nemo-12b
This Mistral model was trained 2x faster with Unsloth and Hugging Face's TRL library.
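A minimal inference sketch with Transformers, assuming the tokenizer ships the ChatML chat template (the Japanese sentence is just a placeholder):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mpasila/shisa-v2-JP-EN-Translator-v0.1-12B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [
    {"role": "system", "content": "You are an AI assistant that translates Japanese to English accurately."},
    {"role": "user", "content": "吾輩は猫である。名前はまだ無い。"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens (the English translation).
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```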
Model tree for mpasila/shisa-v2-JP-EN-Translator-v0.1-12B:
- Base model: mistralai/Mistral-Nemo-Base-2407
- Finetuned: mistralai/Mistral-Nemo-Instruct-2407
- Finetuned: shisa-ai/shisa-v2-mistral-nemo-12b
- This model: mpasila/shisa-v2-JP-EN-Translator-v0.1-12B