# English-Japanese Transformer (65M Parameters)
A high-capacity Transformer model for English → Japanese translation, trained on 1 million sentence pairs from the Sampuran dataset. This model balances high capacity ($\approx 65$ million parameters) with aggressive regularization (dropout $\mathbf{0.45}$) to ensure generalization across a large dataset.
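For intuition on where a parameter count in this range comes from, here is a rough back-of-the-envelope count for a standard encoder-decoder Transformer. All hyperparameter values below (vocabulary sizes, `d_model`, feed-forward width, layer counts) are illustrative assumptions, not this model's actual configuration:

```python
def transformer_params(vocab_src, vocab_tgt, d_model, d_ff, n_layers):
    """Approximate parameter count for a vanilla encoder-decoder Transformer.

    Ignores biases and layer-norm parameters, which are comparatively small.
    """
    embed = (vocab_src + vocab_tgt) * d_model      # input embeddings
    attn = 4 * d_model * d_model                   # Q, K, V, output projections
    ffn = 2 * d_model * d_ff                       # two dense layers
    encoder = n_layers * (attn + ffn)              # self-attention + FFN
    decoder = n_layers * (2 * attn + ffn)          # self-attn + cross-attn + FFN
    head = d_model * vocab_tgt                     # final softmax projection
    return embed + encoder + decoder + head

# Illustrative values only: prints ~90M with untied embeddings.
print(transformer_params(30_000, 30_000, 512, 2048, 6))
```

With tied embeddings or a smaller vocabulary, a configuration like this lands in the ~65M range.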
## Performance
- Model Size: 65,324,816 parameters
- Training Data: 1 million sentence pairs
- Average Character BLEU Score (Test Set): 0.0836 (see the scoring sketch below this list)
- Best Validation Loss: 0.9559
- Final Training Epochs: 4 (stopped by the EarlyStopping callback)
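Character-level BLEU is a common choice for Japanese, where word boundaries are not whitespace-delimited. The exact evaluation script for this model is not included here; the sketch below shows one standard way to compute such a score with NLTK (the smoothing choice is an assumption):

```python
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

def char_bleu(references, hypotheses):
    """Corpus BLEU over character n-grams; returns a score in [0, 1]."""
    ref_chars = [[list(ref)] for ref in references]   # one reference per hypothesis
    hyp_chars = [list(hyp) for hyp in hypotheses]
    smooth = SmoothingFunction().method1              # guard against zero n-gram counts
    return corpus_bleu(ref_chars, hyp_chars, smoothing_function=smooth)

# Toy example (not from the test set):
print(char_bleu(["私は猫が好きです。"], ["私は猫が好きだ。"]))
```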
## Usage
The model can be loaded and used for inference with the included tokenizers.
```python
import keras
from huggingface_hub import hf_hub_download

# Download the serialized Keras model from the Hub
model_path = hf_hub_download(
    repo_id="RinKana/eng-jpn-transformer-nmt-efficient-63M",
    filename="transformer_model.keras",
)
model = keras.models.load_model(model_path)
```
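Because the tokenizer format is not shown above, here is a minimal greedy-decoding sketch assuming a Keras `TextVectorization` setup. The tokenizer objects, special tokens (`"[start]"`, `"[end]"`), and sequence length are all assumptions; adapt them to the tokenizers actually included with this repo:

```python
import numpy as np

# `model` is the Transformer loaded above; `source_vec` and `target_vec` are
# hypothetical keras.layers.TextVectorization tokenizers restored from the
# repo's tokenizer files (their names and format are assumptions).
def translate(sentence, source_vec, target_vec, max_len=40):
    vocab = target_vec.get_vocabulary()
    encoder_input = source_vec([sentence])              # (1, src_seq_len)
    decoded = "[start]"                                 # assumed start token
    for i in range(max_len):
        decoder_input = target_vec([decoded])[:, :-1]   # drop final position
        preds = model([encoder_input, decoder_input])   # (1, tgt_seq_len, vocab)
        next_id = int(np.argmax(preds[0, i, :]))        # greedy pick at step i
        next_token = vocab[next_id]
        if next_token == "[end]":                       # assumed end token
            break
        decoded += " " + next_token
    return decoded.replace("[start]", "").strip()

# print(translate("I like cats.", source_vec, target_vec))
```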