# English-Japanese Transformer (65M Parameters)
A high-capacity Transformer model for English → Japanese translation, trained on 1 million sentence pairs from the Sampuran dataset. This model balances high capacity ($\approx 65$ million parameters) with aggressive regularization (dropout $\mathbf{0.45}$) to ensure generalization across a large dataset.
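For intuition on where a parameter count in this range comes from, here is a rough back-of-the-envelope count for a standard encoder-decoder Transformer. All hyperparameter values below (vocabulary sizes, `d_model`, feed-forward width, layer counts) are illustrative assumptions, not this model's actual configuration:

```python
def transformer_params(vocab_src, vocab_tgt, d_model, d_ff, n_layers):
    """Approximate parameter count for a vanilla encoder-decoder Transformer.

    Ignores biases and layer-norm parameters, which are comparatively small.
    """
    embed = (vocab_src + vocab_tgt) * d_model      # input embeddings
    attn = 4 * d_model * d_model                   # Q, K, V, output projections
    ffn = 2 * d_model * d_ff                       # two dense layers
    encoder = n_layers * (attn + ffn)              # self-attention + FFN
    decoder = n_layers * (2 * attn + ffn)          # self-attn + cross-attn + FFN
    head = d_model * vocab_tgt                     # final softmax projection
    return embed + encoder + decoder + head

# Illustrative values only: prints ~90M with untied embeddings.
print(transformer_params(30_000, 30_000, 512, 2048, 6))
```

With tied embeddings or a smaller vocabulary, a configuration like this lands in the ~65M range.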
## Performance
- Model Size: 65,324,816 parameters
- Training Data: 1 million sentence pairs
- Average Character BLEU Score (Test Set): 0.0836 (see the scoring sketch below this list)
- Best Validation Loss: 0.9559
- Final Training Epochs: 4 (stopped by the EarlyStopping callback)
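Character-level BLEU is a common choice for Japanese, where word boundaries are not whitespace-delimited. The exact evaluation script for this model is not included here; the sketch below shows one standard way to compute such a score with NLTK (the smoothing choice is an assumption):

```python
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

def char_bleu(references, hypotheses):
    """Corpus BLEU over character n-grams; returns a score in [0, 1]."""
    ref_chars = [[list(ref)] for ref in references]   # one reference per hypothesis
    hyp_chars = [list(hyp) for hyp in hypotheses]
    smooth = SmoothingFunction().method1              # guard against zero n-gram counts
    return corpus_bleu(ref_chars, hyp_chars, smoothing_function=smooth)

# Toy example (not from the test set):
print(char_bleu(["私は猫が好きです。"], ["私は猫が好きだ。"]))
```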
## Usage
The model can be loaded and used for inference with the included tokenizers.
```python
import keras
from huggingface_hub import hf_hub_download

# Download the serialized Keras model from the Hub
model_path = hf_hub_download(
    repo_id="RinKana/eng-jpn-transformer-nmt-efficient-63M",
    filename="transformer_model.keras",
)
model = keras.models.load_model(model_path)
```
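Because the tokenizer format is not shown above, here is a minimal greedy-decoding sketch assuming a Keras `TextVectorization` setup. The tokenizer objects, special tokens (`"[start]"`, `"[end]"`), and sequence length are all assumptions; adapt them to the tokenizers actually included with this repo:

```python
import numpy as np

# `model` is the Transformer loaded above; `source_vec` and `target_vec` are
# hypothetical keras.layers.TextVectorization tokenizers restored from the
# repo's tokenizer files (their names and format are assumptions).
def translate(sentence, source_vec, target_vec, max_len=40):
    vocab = target_vec.get_vocabulary()
    encoder_input = source_vec([sentence])              # (1, src_seq_len)
    decoded = "[start]"                                 # assumed start token
    for i in range(max_len):
        decoder_input = target_vec([decoded])[:, :-1]   # drop final position
        preds = model([encoder_input, decoder_input])   # (1, tgt_seq_len, vocab)
        next_id = int(np.argmax(preds[0, i, :]))        # greedy pick at step i
        next_token = vocab[next_id]
        if next_token == "[end]":                       # assumed end token
            break
        decoded += " " + next_token
    return decoded.replace("[start]", "").strip()

# print(translate("I like cats.", source_vec, target_vec))
```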