English-Japanese Transformer (65M Parameters)

A high-capacity Transformer model for English → Japanese translation, trained on 1 million sentence pairs from the Sampuran dataset. The model balances high capacity ($\approx 65$ million parameters) against aggressive regularization (dropout $\mathbf{0.45}$) so that it generalizes rather than overfits the large training set.
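For orientation, the sketch below shows what one encoder block with this dropout rate could look like in plain Keras. The width, head count, and feed-forward size are illustrative assumptions; the card only states the total parameter count and the dropout rate of 0.45.

```python
import keras
from keras import layers

# Illustrative hyperparameters (assumptions); only DROPOUT = 0.45 comes from the card.
D_MODEL, NUM_HEADS, FFN_DIM, DROPOUT = 512, 8, 2048, 0.45

inputs = keras.Input(shape=(None, D_MODEL))
# Self-attention sub-layer, with dropout on the attention weights
attn = layers.MultiHeadAttention(
    num_heads=NUM_HEADS, key_dim=D_MODEL // NUM_HEADS, dropout=DROPOUT
)(inputs, inputs)
x = layers.LayerNormalization()(inputs + layers.Dropout(DROPOUT)(attn))
# Position-wise feed-forward sub-layer, again followed by dropout
ffn = layers.Dense(FFN_DIM, activation="relu")(x)
ffn = layers.Dense(D_MODEL)(ffn)
outputs = layers.LayerNormalization()(x + layers.Dropout(DROPOUT)(ffn))
encoder_block = keras.Model(inputs, outputs)
```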

Performance

  • Model Size: 65,324,816 parameters
  • Training Data: 1 million sentence pairs
  • Average Character BLEU Score (Test Set): 0.0836
  • Best Validation Loss: 0.9559
  • Final Training Epochs: 4 (stopped early by the EarlyStopping callback)

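The character-level BLEU figure above treats each character as a token, which suits Japanese output where word segmentation is nontrivial. Below is a minimal sketch of how such a score can be computed with NLTK; this is one common recipe, not necessarily the exact script used for the number above.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def char_bleu(reference: str, hypothesis: str) -> float:
    """BLEU over character tokens; smoothing avoids zero scores on short strings."""
    smooth = SmoothingFunction().method1
    return sentence_bleu([list(reference)], list(hypothesis), smoothing_function=smooth)

print(char_bleu("私は猫が好きです", "私は犬が好きです"))
```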
Usage

The model can be loaded and used for inference with the included tokenizers.

```python
import keras
from huggingface_hub import hf_hub_download

# Download the saved Keras model from the Hugging Face Hub
model_path = hf_hub_download(
    repo_id="RinKana/eng-jpn-transformer-nmt-efficient-63M",
    filename="transformer_model.keras",
)
model = keras.models.load_model(model_path)
```
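Text must be converted to token IDs with the repository's tokenizers before calling the model. As a purely illustrative sketch, the loop below shows greedy autoregressive decoding; the input signature (`[source_ids, target_prefix]` mapping to per-step logits) and the special-token IDs are assumptions, since the card does not document the exported model's exact interface.

```python
import numpy as np

def greedy_translate(model, src_ids, start_id, end_id, max_len=64):
    """Greedy decoding sketch: repeatedly feed the growing target prefix
    back into the model and pick the most likely next token.

    Assumes model.predict([source_ids, target_ids]) returns logits of shape
    (batch, target_len, vocab_size); adapt to the model's real signature.
    """
    tgt_ids = [start_id]
    for _ in range(max_len):
        logits = model.predict(
            [np.array([src_ids]), np.array([tgt_ids])], verbose=0
        )
        next_id = int(np.argmax(logits[0, -1]))
        tgt_ids.append(next_id)
        if next_id == end_id:  # stop once the end-of-sequence token appears
            break
    return tgt_ids[1:]  # drop the start token
```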