KIT IWSLT25 Offline Collection
Models used in the KIT submission to the IWSLT 25 offline task.
TowerInstruct-7B-v0.2 adapted for English-German translation. We filter the IWSLT data with quality-estimation models and train on the resulting high-quality subset, optimizing for this specific language pair. We find the adapted model to be better than the base model, especially on the speech domain.
Usage is the same as for the base model. Note that we only evaluated English-German translation; the model's performance on other languages and translation tasks is unknown.
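The quality-estimation filtering step can be sketched as below. The scores, threshold, and sentence pairs here are purely illustrative; this card does not specify which QE model or cutoff was used.

```python
# Hypothetical sketch: keep only sentence pairs whose QE score clears a threshold.
# In practice the scores would come from a quality-estimation model scoring each pair.
pairs = [
    ("Welcome to the first lecture", "Willkommen zur ersten Vorlesung", 0.92),
    ("Noisy ASR segment", "Falsche Übersetzung", 0.41),
]
THRESHOLD = 0.8  # illustrative cutoff, not the one used in the submission
filtered = [(src, tgt) for src, tgt, score in pairs if score >= THRESHOLD]
print(len(filtered))  # only the high-quality pair survives
```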
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Unbabel/TowerInstruct-7B-v0.2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.padding_side = "left"  # pad on the left so generation continues directly from the prompt
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
model.load_adapter("skoneru/iwslt_mt_ende")
<|im_start|>user
Translate the sentence from English into German.
English: {src_sentence}
German:<|im_end|>
<|im_start|>assistant
{model generates the German translation here}
After loading the model and the tokenizer, you can run translation with the prompt format as shown below:
src_sent = "Welcome to the first lecture"
prefix = "<|im_start|>user\nTranslate the sentence from English into German.\nEnglish: "
suffix = "\nGerman:<|im_end|>\n<|im_start|>assistant\n"
prompt = [prefix + src_sent + suffix]
inputs = tokenizer(prompt, return_tensors="pt", padding=True, add_special_tokens=False).to(model.device)

# Deterministic beam search decoding
num_beams = 5
output = model.generate(**inputs, num_beams=num_beams, max_new_tokens=256, return_dict_in_generate=True, early_stopping=True, do_sample=False)

# Strip the prompt tokens and decode only the generated continuation
hyps = tokenizer.batch_decode(output.sequences[:, inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(hyps)
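Because the tokenizer pads on the left, several sentences can be translated in one batch and the generated tokens stay aligned at the end of each sequence. A minimal sketch of building such a batch (the prompts are then passed to the tokenizer and `model.generate` exactly as in the single-sentence example above; the second source sentence is an invented placeholder):

```python
prefix = "<|im_start|>user\nTranslate the sentence from English into German.\nEnglish: "
suffix = "\nGerman:<|im_end|>\n<|im_start|>assistant\n"

# Wrap every source sentence in the same prompt template
src_sents = ["Welcome to the first lecture", "Let us begin with an overview"]
prompts = [prefix + s + suffix for s in src_sents]
print(len(prompts))
```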
If you use this model in your research, please cite:
@article{koneru2025kit,
  title={KIT's Offline Speech Translation and Instruction Following Submission for IWSLT 2025},
  author={Koneru, Sai and Z{\"u}fle, Maike and Nguyen, Thai-Binh and Akti, Seymanur and Niehues, Jan and Waibel, Alexander},
  journal={arXiv preprint arXiv:2505.13036},
  year={2025},
  url={https://arxiv.org/abs/2505.13036}
}
Base model
Unbabel/TowerInstruct-7B-v0.2