---
license: mit
---

**How to translate with this model**

+ Install [Python 3.9](https://www.python.org/downloads/release/python-390/), CTranslate2 and subword-nmt:

```bash
pip install ctranslate2~=3.20.0
```

```bash
pip install subword-nmt
```

+ Tokenize the input with BPE:

```bash
subword-nmt apply-bpe -c gl-detok10k.code < input_file.txt > input_file_bpe.txt
```

+ Translate the BPE-tokenized input with the ct2_detok-gl-zh model:

```bash
python3 trans_ct2.py ct2_detok-gl-zh input_file_bpe.txt > output_file_bpe.txt
```
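
The `trans_ct2.py` script is not reproduced here. As a rough idea of what such a script could do, the following is a minimal sketch that assumes the CTranslate2 Python API (`ctranslate2.Translator` and `translate_batch`). The function names `read_tokens` and `translate_tokens`, the CPU device and the beam size of 5 are illustrative assumptions, not settings taken from the actual script.

```python
import sys


def read_tokens(path):
    """Read a BPE-tokenized file and split each line into tokens on whitespace."""
    with open(path, encoding="utf-8") as handle:
        return [line.split() for line in handle]


def translate_tokens(model_dir, batch, beam_size=5):
    """Translate a list of token lists and return the best hypothesis for each."""
    # Imported here so that read_tokens can be used without CTranslate2 installed.
    import ctranslate2

    translator = ctranslate2.Translator(model_dir, device="cpu")
    results = translator.translate_batch(batch, beam_size=beam_size)
    return [result.hypotheses[0] for result in results]


def main(argv):
    model_dir, input_path = argv[1], argv[2]
    for tokens in translate_tokens(model_dir, read_tokens(input_path)):
        # The output is still BPE-tokenized, one sentence per line.
        print(" ".join(tokens))


if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv)
```

Saved under a hypothetical name such as `translate_sketch.py`, it would be called in the same way as the command above: `python3 translate_sketch.py ct2_detok-gl-zh input_file_bpe.txt > output_file_bpe.txt`.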

+ Remove the BPE segmentation from the output:

```bash
sed "s/@@ //g" output_file_bpe.txt > output_file.txt
```
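
The same clean-up can also be done in Python, for example when post-processing translations inside a script. This is a small sketch: `remove_bpe` is a hypothetical helper name, and unlike the `sed` command above it also drops a dangling `@@` left at the end of a line.

```python
import re

# Matches the "@@ " joiner inside a line, or a dangling "@@" at the end of a line.
_BPE_JOINER = re.compile(r"@@( |$)")


def remove_bpe(line):
    """Merge BPE subword units back into whole words."""
    return _BPE_JOINER.sub("", line)


print(remove_bpe("Bo@@ s dí@@ as"))  # Bos días
```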

**Acknowledgments**

Thanks to Tang Waying, Zheng Jie and Wang Tianjiao for helping prepare the parallel corpora.