---
license: mit
tags:
- translation
- pytorch
- encoder-decoder
- transformer
- english-to-hindi
- nmt
library_name: pytorch
language:
- en
- hi
---
# TransformerNMT: English-to-Hindi Experimental Transformer Model
This repository contains a **Transformer Encoder-Decoder** model implemented from scratch in PyTorch for English-to-Hindi neural machine translation. The model and all training, preprocessing, and inference scripts are custom: they do **not** use Hugging Face Transformers and instead follow the architecture from the original "Attention Is All You Need" paper.
## Model Details
- **Architecture:** Transformer Encoder-Decoder (Vaswani et al., 2017)
- **Framework:** PyTorch
- **Languages:** English (source) → Hindi (target)
- **Vocabulary:** 32,000 BPE tokens per language (trained with `tokenizers`)
- **Training Data:** Parallel English-Hindi corpus (see repo for data details)
- **Intended Use:** Research, experimentation, and educational purposes
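
To make the vocabulary entry above more concrete, here is a toy sketch of how byte-pair encoding (BPE) learns merges from a corpus. It is an illustration in plain Python only: the function names `learn_bpe_merges` and `apply_bpe` are hypothetical, and the actual tokenizers in this repository are trained with the `tokenizers` library, whose implementation differs (pre-tokenization, special tokens, a 32,000-entry target vocabulary, and so on).

```python
from collections import Counter


def learn_bpe_merges(words, num_merges):
    """Learn up to `num_merges` BPE merges from a list of words (toy version)."""
    # Each word starts as a tuple of single characters, weighted by frequency.
    vocab = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair across the corpus.
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # every word is already a single symbol
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        # Rewrite each word, fusing every occurrence of the chosen pair.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges


def apply_bpe(word, merges):
    """Segment a word by replaying the learned merges in order."""
    symbols = list(word)
    for pair in merges:
        i = 0
        while i + 1 < len(symbols):
            if (symbols[i], symbols[i + 1]) == pair:
                symbols[i:i + 2] = [symbols[i] + symbols[i + 1]]
            else:
                i += 1
    return symbols


toy_merges = learn_bpe_merges(["ab", "ab", "ab", "abc"], num_merges=5)
toy_segments = apply_bpe("abcab", toy_merges)
```

Frequent character sequences end up as single tokens while rare words fall back to smaller pieces, which is why a fixed-size subword vocabulary can cover open-ended text in both English and Hindi.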
## Training
- Trained from scratch using the scripts in this repository.
- Supports distributed and mixed-precision training.
- Checkpoints and tokenizer files are provided in the `models/` and `Data/bi_tokenizers_32k/` directories.
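
As a rough illustration of what mixed-precision training involves in PyTorch, the sketch below runs one optimizer step under `torch.autocast` with a gradient scaler. It is not taken from this repository's training scripts: `amp_train_step` is a hypothetical helper, and the `nn.Linear` layer and random tensors are stand-ins for the Transformer and a real batch.

```python
import torch
from torch import nn


def amp_train_step(model, inputs, targets, optimizer, scaler, device_type):
    """Run one optimizer step under automatic mixed precision (sketch)."""
    # float16 on GPU needs loss scaling; bfloat16 is the usual CPU autocast dtype.
    amp_dtype = torch.float16 if device_type == "cuda" else torch.bfloat16
    optimizer.zero_grad(set_to_none=True)
    with torch.autocast(device_type=device_type, dtype=amp_dtype):
        logits = model(inputs)  # forward pass in reduced precision
        # Compute the loss in float32 for numerical stability.
        loss = nn.functional.cross_entropy(logits.float(), targets)
    scaler.scale(loss).backward()  # scaling guards against fp16 gradient underflow
    scaler.step(optimizer)         # unscales gradients; skips the step on inf/NaN
    scaler.update()
    return loss.item()


device_type = "cuda" if torch.cuda.is_available() else "cpu"
toy_model = nn.Linear(8, 4).to(device_type)  # stand-in for the Transformer
optimizer = torch.optim.Adam(toy_model.parameters(), lr=1e-3)
# The scaler is a no-op when disabled, so the same code path also runs on CPU.
# Newer PyTorch releases prefer torch.amp.GradScaler("cuda", ...).
scaler = torch.cuda.amp.GradScaler(enabled=(device_type == "cuda"))

inputs = torch.randn(16, 8, device=device_type)
targets = torch.randint(0, 4, (16,), device=device_type)
loss_value = amp_train_step(toy_model, inputs, targets, optimizer, scaler, device_type)
```

The model's master weights stay in float32 throughout; only the forward computation inside the `autocast` block runs in the lower-precision dtype.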
## Intended Uses & Limitations
- **Intended for:** Experimentation, research, and demonstration of custom Transformer implementations.
- **Not intended for:** Production use or high-stakes applications.
- **Limitations:** May not achieve state-of-the-art translation quality. Use with caution for real-world tasks.
## Example Inference
Below is a simple inference script to translate English text to Hindi using the trained model and tokenizer:
```python
import torch

# Project-local modules from this repository (not Hugging Face Transformers)
from tokenizer import BilingualTokenizer as Tokenizer
from model import Transformer, TransformerConfig
from translator import TranslationInference

# 1. Load config and checkpoint
config = TransformerConfig(shared_embeddings=True)
checkpoint = torch.load('models/TNMT_v1_Beta_single.pt', map_location='cpu')

# 2. Build model, load weights, and switch to inference mode
model = Transformer(config)
model.load_state_dict(checkpoint['model_state_dict'])
model = model.to('cpu')
model.eval()  # disable dropout for inference

# 3. Load tokenizer (adjust the path if the files live under Data/bi_tokenizers_32k/)
tokenizer = Tokenizer(vocab_size=32000)
tokenizer_loaded = tokenizer.load_tokenizers('bi_tokenizers_32k')

# 4. Create inference helper
translator = TranslationInference(
    model=model,
    tokenizer=tokenizer_loaded,
    device='cpu'
)

# 5. Translate
source_text = "This is a test sentence."
translated_text = translator.translate_text(source_text)
print("Translated text:", translated_text)
```
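
The script above hides the decoding loop inside `translate_text`. For readers new to encoder-decoder inference, the following is a minimal sketch of greedy decoding, one common strategy. How `TranslationInference` actually decodes (greedy, beam search, or something else) is not stated here, so treat this as an assumption-laden illustration: `greedy_decode`, `next_token_scores`, and `toy_scorer` are hypothetical names, and the token IDs are made up.

```python
def greedy_decode(next_token_scores, bos_id, eos_id, max_len=50):
    """Generate target token IDs one at a time, always taking the top score.

    `next_token_scores` is a hypothetical callable: given the target prefix
    (a list of token IDs), it returns one score per vocabulary entry. In a
    real model it would wrap the decoder's forward pass over the encoded source.
    """
    prefix = [bos_id]
    for _ in range(max_len):
        scores = next_token_scores(prefix)
        # Pick the index of the highest-scoring token.
        next_id = max(range(len(scores)), key=scores.__getitem__)
        prefix.append(next_id)
        if next_id == eos_id:
            break  # the model signalled end of sentence
    return prefix[1:]  # drop the BOS token


def toy_scorer(prefix):
    """Toy scorer over a 5-token vocabulary: favors 3, then 4, then EOS (id 2)."""
    script = {1: 3, 2: 4}  # prefix length -> token to favor
    favored = script.get(len(prefix), 2)
    return [1.0 if i == favored else 0.0 for i in range(5)]


output_ids = greedy_decode(toy_scorer, bos_id=1, eos_id=2)
```

The `max_len` cap matters in practice: a model that never emits the end-of-sentence token would otherwise loop forever.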
## Citation
If you use this code or model, please cite:
> Vaswani et al., "Attention Is All You Need", NeurIPS 2017.
---
**Author:** QuarkML

**License:** MIT