KIT IWSLT25 Offline Collection
Models used in the KIT submission to the IWSLT 25 offline task.
TowerInstruct-7B-v0.2 adapted for English-German translation. We filter the IWSLT data with quality-estimation models and train on the resulting high-quality subset, optimizing for this specific language pair. We find the adapted model to be better than the base model, especially on the speech domain.
Usage is the same as for the base model. Note that we only evaluated English-German translation; the model's performance on other languages and translation tasks is unknown.
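The quality-estimation filtering step can be sketched as below. The scores, threshold, and sentence pairs here are purely illustrative; this card does not specify which QE model or cutoff was used.

```python
# Hypothetical sketch: keep only sentence pairs whose QE score clears a threshold.
# In practice the scores would come from a quality-estimation model scoring each pair.
pairs = [
    ("Welcome to the first lecture", "Willkommen zur ersten Vorlesung", 0.92),
    ("Noisy ASR segment", "Falsche Übersetzung", 0.41),
]
THRESHOLD = 0.8  # illustrative cutoff, not the one used in the submission
filtered = [(src, tgt) for src, tgt, score in pairs if score >= THRESHOLD]
print(len(filtered))  # only the high-quality pair survives
```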
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Unbabel/TowerInstruct-7B-v0.2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.padding_side = "left"  # pad on the left so generation continues directly from the prompt
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
model.load_adapter("skoneru/iwslt_mt_ende")
<|im_start|>user
Translate the sentence from English into German.
English: {src_sentence}
German:<|im_end|>
<|im_start|>assistant
{model generates the German translation here}
After loading the model and the tokenizer, you can run translation with the prompt format as shown below:
src_sent = "Welcome to the first lecture"
prefix = "<|im_start|>user\nTranslate the sentence from English into German.\nEnglish: "
suffix = "\nGerman:<|im_end|>\n<|im_start|>assistant\n"
prompt = [prefix + src_sent + suffix]
inputs = tokenizer(prompt, return_tensors="pt", padding=True, add_special_tokens=False).to(model.device)

# Deterministic beam search decoding
num_beams = 5
output = model.generate(**inputs, num_beams=num_beams, max_new_tokens=256, return_dict_in_generate=True, early_stopping=True, do_sample=False)

# Strip the prompt tokens and decode only the generated continuation
hyps = tokenizer.batch_decode(output.sequences[:, inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(hyps)
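Because the tokenizer pads on the left, several sentences can be translated in one batch and the generated tokens stay aligned at the end of each sequence. A minimal sketch of building such a batch (the prompts are then passed to the tokenizer and `model.generate` exactly as in the single-sentence example above; the second source sentence is an invented placeholder):

```python
prefix = "<|im_start|>user\nTranslate the sentence from English into German.\nEnglish: "
suffix = "\nGerman:<|im_end|>\n<|im_start|>assistant\n"

# Wrap every source sentence in the same prompt template
src_sents = ["Welcome to the first lecture", "Let us begin with an overview"]
prompts = [prefix + s + suffix for s in src_sents]
print(len(prompts))
```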
If you use this model in your research, please cite:
@article{koneru2025kit,
  title={KIT's Offline Speech Translation and Instruction Following Submission for IWSLT 2025},
  author={Koneru, Sai and Z{\"u}fle, Maike and Nguyen, Thai-Binh and Akti, Seymanur and Niehues, Jan and Waibel, Alexander},
  journal={arXiv preprint arXiv:2505.13036},
  year={2025},
  url={https://arxiv.org/abs/2505.13036}
}
Base model
Unbabel/TowerInstruct-7B-v0.2