	---
	license: mit
	tags:
	- translation
	- pytorch
	- encoder-decoder
	- transformer
	- english-to-hindi
	- nmt
	library_name: pytorch
	language:
	- en
	- hi
	---

	# TransformerNMT: English-to-Hindi Experimental Transformer Model

	This repository contains a Transformer encoder-decoder model for English-to-Hindi neural machine translation, implemented from scratch in PyTorch. The model and the training, preprocessing, and inference scripts are all custom: they do not depend on Hugging Face Transformers, and the architecture follows the original "Attention Is All You Need" paper (Vaswani et al., 2017).

	## Model Details

	- Architecture: Transformer Encoder-Decoder (Vaswani et al., 2017)
	- Framework: PyTorch
	- Languages: English (source) → Hindi (target)
	- Vocabulary: 32,000 BPE tokens per language (trained with `tokenizers`)
	- Training Data: Parallel English-Hindi corpus (see repo for data details)
	- Intended Use: Research, experimentation, and educational purposes
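
To illustrate what a BPE vocabulary is built from, here is a minimal pure-Python sketch of the merge-learning loop. It is a toy version for intuition only: it is not the `tokenizers` library code used for this model, the function name `learn_bpe_merges` is hypothetical, and it splits words into raw code points, which a production tokenizer would handle more carefully for Devanagari text.

```python
from collections import Counter


def learn_bpe_merges(words, num_merges):
    """Learn up to `num_merges` BPE merge rules from a list of words (toy sketch)."""
    # Each word starts as a tuple of single-character symbols.
    vocab = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often each word occurs.
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # every word is already a single symbol
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        # Rewrite every word, fusing each occurrence of the best pair.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges


merges = learn_bpe_merges(["low", "low", "lower", "lowest"], num_merges=2)
print(merges)  # [('l', 'o'), ('lo', 'w')]
```

A 32,000-token vocabulary corresponds, roughly, to running this kind of loop until the base symbols plus learned merges reach the target size.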

	## Training

	- Trained from scratch using the scripts in this repository.
	- Supports distributed and mixed-precision training.
	- Checkpoints are provided in the `models/` directory and tokenizer files in `Data/bi_tokenizers_32k/`.
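
Before loading anything, it can help to confirm that the files are where this README says they are. The sketch below is a small pre-flight check using only the standard library. The helper name `missing_artifacts` is hypothetical, and the two paths are the ones mentioned in this README, so adjust them if your checkout is laid out differently.

```python
from pathlib import Path

# Paths as named in this README; they are assumptions about your local layout.
EXPECTED_ARTIFACTS = ("models/TNMT_v1_Beta_single.pt", "Data/bi_tokenizers_32k")


def missing_artifacts(repo_root, expected=EXPECTED_ARTIFACTS):
    """Return the expected relative paths that do not exist under `repo_root`."""
    root = Path(repo_root)
    return [rel for rel in expected if not (root / rel).exists()]


# Report anything missing relative to the current working directory.
print("Missing:", missing_artifacts("."))
```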

	## Intended Uses & Limitations

	- Intended for: Experimentation, research, and demonstration of custom Transformer implementations.
	- Not intended for: Production use or high-stakes applications.
	- Limitations: Translation quality may fall short of state-of-the-art systems, so check outputs carefully before relying on them in real-world tasks.

	## Example Inference

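	The following script translates English text to Hindi with the trained model and tokenizer. It assumes the repository's `tokenizer`, `model`, and `translator` modules are importable from your working directory: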

	```python
	import torch
	from tokenizer import BilingualTokenizer as Tokenizer
	from model import Transformer, TransformerConfig
	from translator import TranslationInference

	# 1. Load config and checkpoint
	config = TransformerConfig(shared_embeddings=True)
	checkpoint = torch.load('models/TNMT_v1_Beta_single.pt', map_location='cpu')

	# 2. Build model and load weights
	model = Transformer(config)
	model.load_state_dict(checkpoint['model_state_dict'])
	model = model.to('cpu')
	model.eval()  # disable dropout for inference (assumes Transformer is a torch.nn.Module)

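	# 3. Load tokenizer (this README lists the files under Data/bi_tokenizers_32k/;
	#    adjust the path below if it does not resolve from your working directory)
	tokenizer = Tokenizer(vocab_size=32000)
	tokenizer_loaded = tokenizer.load_tokenizers('bi_tokenizers_32k')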

	# 4. Create inference helper
	translator = TranslationInference(
	    model=model,
	    tokenizer=tokenizer_loaded,
	    device='cpu'
	)

	# 5. Translate
	source_text = "This is a test sentence."
	translated_text = translator.translate_text(source_text)
	print("Translated text:", translated_text)
	```
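
To translate several sentences, one option is a thin wrapper around the single-sentence call. The sketch below assumes only what the script above shows, namely a callable that maps one English string to one Hindi string (such as `translator.translate_text`). The helper `translate_many` and the stand-in `fake_translate` are hypothetical and exist so the example runs without a checkpoint.

```python
from typing import Callable, Iterable, List


def translate_many(translate_fn: Callable[[str], str], sentences: Iterable[str]) -> List[str]:
    """Translate each sentence with `translate_fn`, keeping outputs aligned with inputs."""
    results = []
    for sentence in sentences:
        text = sentence.strip()
        # Blank inputs map to empty strings so the output list keeps the same length.
        results.append(translate_fn(text) if text else "")
    return results


def fake_translate(text: str) -> str:
    """Stand-in for `translator.translate_text`; it only tags the input."""
    return f"<hi>{text}</hi>"


outputs = translate_many(fake_translate, ["This is a test sentence.", "  ", "Hello."])
print(outputs)
```

With the real model loaded, `translate_many(translator.translate_text, sentences)` would be the corresponding call. This loops sentence by sentence and does not batch on the device.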

	## Citation

	If you use this code or model, please cite:

	> Vaswani et al., "Attention Is All You Need", NeurIPS 2017.

	---

	- Author: QuarkML
	- License: MIT