SalamandraTA-2B-academic Model Card

This repository contains SalamandraTA-2B-academic, a Machine Translation fine-tuning of Salamandra-2B-Instruct. The model was obtained following the procedure described in ACADATA: Parallel Dataset of Academic Data for Machine Translation.

DISCLAIMER: This version of Salamandra is tailored exclusively for translation tasks. Although it was obtained by fine-tuning an instruction-tuned model, its chat capabilities have not been tested. For chat use, we refer to the instructed version it is based on.

Model Details

Architecture


Total Parameters	2,253,490,176
Embedding Parameters	524,288,000
Layers	24
Hidden size	2,048
Attention heads	16
Context length	8,192
Vocabulary size	256,000
Precision	bfloat16
Embedding type	RoPE
Activation Function	SwiGLU
Layer normalization	RMS Norm
Flash attention	✅
Grouped Query Attention	❌
Num. query groups	N/A
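
As a quick sanity check on the figures above, the following sketch derives a few quantities from the table. It assumes a single embedding matrix of vocabulary size × hidden size, an even split of the hidden size across attention heads, and 2 bytes per bfloat16 value; the variable names are illustrative only.

```python
# Figures copied from the architecture table above.
TOTAL_PARAMS = 2_253_490_176
VOCAB_SIZE = 256_000
HIDDEN_SIZE = 2_048
ATTENTION_HEADS = 16
BYTES_PER_BF16 = 2  # bfloat16 stores each value in 16 bits

# Assumption: one embedding vector of HIDDEN_SIZE values per vocabulary entry.
embedding_params = VOCAB_SIZE * HIDDEN_SIZE
non_embedding_params = TOTAL_PARAMS - embedding_params

# Assumption: the hidden size is split evenly across the attention heads.
head_dim = HIDDEN_SIZE // ATTENTION_HEADS

# Rough size of the weights alone; activations and the KV cache are extra.
weights_gib = TOTAL_PARAMS * BYTES_PER_BF16 / 2**30

print(f"embedding parameters:     {embedding_params:,}")
print(f"non-embedding parameters: {non_embedding_params:,}")
print(f"dimension per head:       {head_dim}")
print(f"bfloat16 weights:         {weights_gib:.2f} GiB")
```

The product of vocabulary size and hidden size reproduces the 524,288,000 embedding parameters listed in the table.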

Intended Use

Direct Use

The model is intended for both research and commercial use in any of the languages included in the training data for general machine translation tasks.

Out-of-scope Use

The model is not intended for malicious activities, such as harming others or violating human rights. Any downstream application must comply with current laws and regulations. Irresponsible usage in production environments without proper risk assessment and mitigation is also discouraged.

Hardware and Software

Training Framework

SalamandraTA-2B-academic was instruction-tuned with FastChat.

Compute Infrastructure

All models were trained on MareNostrum 5, a pre-exascale EuroHPC supercomputer hosted and operated by Barcelona Supercomputing Center.

The accelerated partition is composed of 1,120 nodes with the following specifications:

4x Nvidia Hopper GPUs with 64 GB HBM2 memory
2x Intel Sapphire Rapids 8460Y+ at 2.3 GHz and 32c each (64 cores)
4x NDR200 (BW per node 800 Gb/s)
512 GB of Main memory (DDR5)
460 GB on NVMe storage
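
For a sense of scale, the per-node figures above can be multiplied out to the whole accelerated partition. This is only arithmetic on the listed specifications: the `NodeSpec` class is a hypothetical container, and the totals describe the partition's capacity, not the resources used to train this model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Per-node figures copied from the list above (hypothetical container)."""
    gpus: int = 4
    gpu_memory_gb: int = 64
    cpu_cores: int = 64
    main_memory_gb: int = 512


NODES = 1_120  # nodes in the accelerated partition
node = NodeSpec()

total_gpus = NODES * node.gpus
total_gpu_memory_gb = total_gpus * node.gpu_memory_gb
total_cpu_cores = NODES * node.cpu_cores
total_main_memory_gb = NODES * node.main_memory_gb

print(f"GPUs:        {total_gpus:,}")
print(f"GPU memory:  {total_gpu_memory_gb:,} GB")
print(f"CPU cores:   {total_cpu_cores:,}")
print(f"Main memory: {total_main_memory_gb:,} GB")
```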

How to use

SalamandraTA-2B-academic was fine-tuned on the ACAD-Train dataset, which focuses on pairs involving English, languages of the Iberian Peninsula, and several Central European languages, namely: Asturian (ast), Catalan (ca), German (de), Greek (el), Spanish (es), English (en), Basque (eu), French (fr), Galician (gl), Italian (it), Dutch (nl) and Portuguese (pt). The dataset includes 48 unique language pairs. Since each pair is used for translation in both directions (e.g., English to Spanish and Spanish to English), the model supports 96 translation directions in total. The most frequent language pairs, accounting for 96.5% of the dataset, are:

English - Spanish (en-es)
English - French (en-fr)
English - Catalan (en-ca)
Catalan - Spanish (ca-es)
Spanish - French (es-fr)
English - Portuguese (en-pt)

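
The relation between language pairs and translation directions can be sketched as follows. The example expands only the six most frequent pairs listed above; the helper name `directions` is illustrative, and the full set of 48 pairs is not reproduced here.

```python
# The six most frequent ACAD-Train pairs, as listed above (language codes).
FREQUENT_PAIRS = [
    ("en", "es"),
    ("en", "fr"),
    ("en", "ca"),
    ("ca", "es"),
    ("es", "fr"),
    ("en", "pt"),
]


def directions(pairs):
    """Expand unordered language pairs into both translation directions."""
    result = []
    for first, second in pairs:
        result.append((first, second))
        result.append((second, first))
    return result


frequent_directions = directions(FREQUENT_PAIRS)
for source, target in frequent_directions:
    print(f"{source} -> {target}")
print(f"{len(FREQUENT_PAIRS)} pairs give {len(frequent_directions)} directions")
```

Applied to all 48 pairs, the same doubling yields the 96 supported directions.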

The instruction-following model uses the commonly adopted ChatML template:

<|im_start|>system
{SYSTEM PROMPT}<|im_end|>
<|im_start|>user
{USER PROMPT}<|im_end|>
<|im_start|>assistant
{MODEL RESPONSE}<|im_end|>
<|im_start|>user
[...]

The easiest way to apply it is by using the tokenizer's built-in functions, as shown in the following snippet.

from datetime import datetime
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "LangTech-MT/salamandraTA-2B-academic"

# Input parameters
source = 'English'
target = 'Spanish'
sentence = "With the purpose of analyzing women’s perceptions and classifying their modes of understanding a positive human papillomavirus (HPV+) test, we conducted 38 in‑depth interviews with women who had received an HPV diagnosis (normal and abnormal Pap smear), screened in Jujuy’s public health system in 2016. A typology based on women’s understandings of the result was developed: 1) understanding; 2) lack of understanding; a) underestimation; b) overestimation; c) confusion. The interviewees who experienced confusion over the results reported contradictory perceptions in relation to a positive HPV test and its severity; those who underestimated it tended to mention the absence of symptoms and expressed little concern over the result; while those who overestimated it considered themselves sick and described concern, narrating a biographical disruption and physical pain. These findings confirm the need to improve the delivery of results and the provision of information in order to decrease psychosocial impact and increase follow‑up adherence in HPV‑positive women."
 
text = f"Translate the following text from {source} into {target}.\n{source}: {sentence} \n{target}:"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16
)

# Construct prompt using chat template
message = [ { "role": "user", "content": text } ]
date_string = datetime.today().strftime('%Y-%m-%d')

prompt = tokenizer.apply_chat_template(
    message,
    tokenize=False,
    add_generation_prompt=True,
    date_string=date_string
)

inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")
input_length = inputs.shape[1]

# Generate output
outputs = model.generate(
    input_ids=inputs.to(model.device), 
    max_new_tokens=400,
    early_stopping=True,
    num_beams=5
)

# Decode and print output
print(tokenizer.decode(outputs[0, input_length:], skip_special_tokens=True))
# Con el propósito de analizar las percepciones de las mujeres y clasificar sus modos de comprensión de un resultado positivo de virus del papiloma humano (VPH+), en 2016 realizamos 38 entrevistas en profundidad a mujeres con diagnóstico de VPH (citología normal y anormal) detectado en el sistema público de salud de Jujuy. Se elaboró una tipología basada en la comprensión del resultado por parte de las mujeres: 1) comprensión; 2) falta de comprensión; a) subestimación; b) sobreestimación; c) confusión. Las entrevistadas que experimentaron confusión informaron percepciones contradictorias sobre el VPH+ y su gravedad; quienes lo subestimaron tendían a mencionar la ausencia de síntomas y mostraron poca preocupación; mientras que aquellas que lo sobreestimaron se consideraban enfermas, describían preocupación, narrando una ruptura biográfica y dolor físico. Estos hallazgos confirman la necesidad de mejorar la entrega de resultados y la provisión de información para disminuir el impacto psicosocial y aumentar la adherencia al seguimiento en mujeres con VPH positivo.

In this template, each turn starts with the <|im_start|> delimiter followed by the role of the speaker (system for the system prompt, user for content supplied by the user, or assistant for model responses) and ends with the <|im_end|> token.
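
To make the turn structure concrete, here is a minimal hand-written renderer for the layout shown above. It is an illustration only: `render_chatml` is a hypothetical helper, and the tokenizer's own chat template remains authoritative, since it may add content that this sketch does not (for instance, anything derived from the `date_string` argument passed in the snippet above).

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts in the ChatML layout.

    Hypothetical helper for illustration; prefer
    tokenizer.apply_chat_template for real use.
    """
    parts = []
    for message in messages:
        # Each turn: opening delimiter, role, newline, content, closing token.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


print(render_chatml([{"role": "user", "content": "Hello"}]))
```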

Machine Translation Prompt

The following prompt template is recommended, since it is the one used during training:

Translate the following text from {source} into {target}.
{source}: {source sentence}
{target}:

Example:

source = 'English'
target = 'Spanish'
source_sentence = "With the purpose of analyzing women’s perceptions and classifying their modes of understanding a positive human papillomavirus (HPV+) test, we conducted 38 in‑depth interviews with women who had received an HPV diagnosis (normal and abnormal Pap smear), screened in Jujuy’s public health system in 2016. A typology based on women’s understandings of the result was developed: 1) understanding; 2) lack of understanding; a) underestimation; b) overestimation; c) confusion. The interviewees who experienced confusion over the results reported contradictory perceptions in relation to a positive HPV test and its severity; those who underestimated it tended to mention the absence of symptoms and expressed little concern over the result; while those who overestimated it considered themselves sick and described concern, narrating a biographical disruption and physical pain. These findings confirm the need to improve the delivery of results and the provision of information in order to decrease psychosocial impact and increase follow‑up adherence in HPV‑positive women."

text = f"Translate the following text from {source} into {target}.\n{source}: {source_sentence} \n{target}:"
# Con el propósito de analizar las percepciones de las mujeres y clasificar sus modos de comprensión de un resultado positivo de virus del papiloma humano (VPH+), en 2016 realizamos 38 entrevistas en profundidad a mujeres con diagnóstico de VPH (citología normal y anormal) detectado en el sistema público de salud de Jujuy. Se elaboró una tipología basada en la comprensión del resultado por parte de las mujeres: 1) comprensión; 2) falta de comprensión; a) subestimación; b) sobreestimación; c) confusión. Las entrevistadas que experimentaron confusión informaron percepciones contradictorias sobre el VPH+ y su gravedad; quienes lo subestimaron tendían a mencionar la ausencia de síntomas y mostraron poca preocupación; mientras que aquellas que lo sobreestimaron se consideraban enfermas, describían preocupación, narrando una ruptura biográfica y dolor físico. Estos hallazgos confirman la necesidad de mejorar la entrega de resultados y la provisión de información para disminuir el impacto psicosocial y aumentar la adherencia al seguimiento en mujeres con VPH positivo.
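
The same prompt can be built from language codes with a small helper. This is a sketch under assumptions: `LANGUAGE_NAMES` and `build_prompt` are hypothetical names, the mapping is taken from the language list in this card, and the f-string mirrors the example above character for character (including the space before the second newline). Not every combination of these twelve languages is necessarily among the 48 pairs in ACAD-Train.

```python
# Language codes and names as listed in this model card.
LANGUAGE_NAMES = {
    "ast": "Asturian", "ca": "Catalan", "de": "German", "el": "Greek",
    "es": "Spanish", "en": "English", "eu": "Basque", "fr": "French",
    "gl": "Galician", "it": "Italian", "nl": "Dutch", "pt": "Portuguese",
}


def build_prompt(source_code, target_code, sentence):
    """Build the translation prompt for a pair of language codes."""
    if source_code == target_code:
        raise ValueError("source and target languages must differ")
    for code in (source_code, target_code):
        if code not in LANGUAGE_NAMES:
            raise ValueError(f"unknown language code: {code!r}")
    source = LANGUAGE_NAMES[source_code]
    target = LANGUAGE_NAMES[target_code]
    # Same layout as the f-string in the example above.
    return (
        f"Translate the following text from {source} into {target}."
        f"\n{source}: {sentence} \n{target}:"
    )


print(build_prompt("en", "ca", "The results confirm the initial hypothesis."))
```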

Instruction Tuning Data

The corpus used for instruction tuning is ACAData. For more details about the corpus construction, please refer to the [paper](https://arxiv.org/abs/2510.12621).

Evaluation

Aggregated results for the xx ↔ en and xx ↔ es translation directions in ACAD-Bench dataset. Baselines are grouped into large-scale proprietary general models, medium- to small-sized open-weights models and dedicated MMNMT models. For every metric the top-scoring system is shown in bold. For a more detailed evaluation discussion, please refer to the paper.
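
As a reading aid for the BP column in the tables that follow, the sketch below shows the standard BLEU brevity penalty. This assumes BP here denotes that quantity, which this card does not define; the function name and the token counts in the example are illustrative and not taken from the evaluation.

```python
import math


def brevity_penalty(candidate_len, reference_len):
    """Standard BLEU brevity penalty for total candidate/reference lengths."""
    if candidate_len <= 0:
        return 0.0
    if candidate_len >= reference_len:
        # Outputs at least as long as the reference are not penalized.
        return 1.0
    # Shorter outputs are penalized exponentially in the length ratio.
    return math.exp(1.0 - reference_len / candidate_len)


# Illustrative lengths only (tokens in system output vs. reference).
for candidate_len in (100, 90, 80):
    print(candidate_len, round(brevity_penalty(candidate_len, 100), 2))
```

Under this reading, a BP well below 1.00 indicates that a system's translations are noticeably shorter than the references.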

xx → en

Direction	Model	d-BLEU	BP	Blonde	Comet	Comet-Kiwi
xx → en	GPT-mini	46.03	1.00	0.60	0.84	0.77
	GPT-nano	41.30	0.97	0.55	0.84	0.78
	Gemini-2	48.65	1.00	0.61	0.84	0.77
	Gemini-2.5	45.10	0.98	0.58	0.84	0.77
	Llama-3-8B	43.12	0.99	0.56	0.83	0.76
	Gemma-3-27B	46.37	0.98	0.59	0.84	0.77
	MADLAD-7B	38.69	0.86	0.51	0.81	0.77
	Salamandra-2B	37.09	0.92	0.52	0.82	0.75
	+ ACADTRAIN	48.45	1.00	0.61	0.83	0.76
	Salamandra-7B	45.87	0.99	0.59	0.83	0.76
	+ ACADTRAIN	50.07	1.00	0.62	0.84	0.76

en → xx

Direction	Model	d-BLEU	BP	Blonde	Comet	Comet-Kiwi
en → xx	GPT-mini	45.01	0.99	-	0.86	0.82
	GPT-nano	43.78	1.00	-	0.86	0.82
	Gemini-2	48.00	0.99	-	0.87	0.82
	Gemini-2.5	47.75	0.99	-	0.87	0.82
	Llama-3-8B	39.87	0.99	-	0.85	0.81
	Gemma-3-27B	46.29	0.99	-	0.86	0.82
	MADLAD-7B	36.08	0.82	-	0.83	0.80
	Salamandra-2B	32.91	0.90	-	0.83	0.78
	+ ACADTRAIN	46.86	0.98	-	0.86	0.81
	Salamandra-7B	42.55	0.98	-	0.86	0.81
	+ ACADTRAIN	49.20	0.98	-	0.86	0.81

xx → es

Direction	Model	d-BLEU	BP	Blonde	Comet	Comet-Kiwi
xx → es	GPT-mini	60.60	0.98	-	0.86	0.82
	GPT-nano	57.88	0.99	-	0.86	0.82
	Gemini-2	62.02	0.99	-	0.86	0.82
	Gemini-2.5	61.43	0.98	-	0.87	0.82
	Llama-3-8B	55.40	0.98	-	0.86	0.81
	Gemma-3-27B	60.71	0.98	-	0.86	0.82
	MADLAD-7B	43.44	0.76	-	0.83	0.81
	Salamandra-2B	50.09	0.92	-	0.85	0.80
	+ ACADTRAIN	61.97	0.98	-	0.86	0.82
	Salamandra-7B	57.55	0.98	-	0.86	0.82
	+ ACADTRAIN	63.60	0.98	-	0.86	0.82

es → xx

Direction	Model	d-BLEU	BP	Blonde	Comet	Comet-Kiwi
es → xx	GPT-mini	54.19	0.99	-	0.86	0.81
	GPT-nano	51.95	0.99	-	0.86	0.81
	Gemini-2	60.28	0.99	-	0.86	0.81
	Gemini-2.5	57.61	0.99	-	0.86	0.81
	Llama-3-8B	52.12	0.99	-	0.85	0.80
	Gemma-3-27B	57.31	0.99	-	0.86	0.81
	MADLAD-7B	40.13	0.79	-	0.83	0.81
	Salamandra-2B	47.84	0.94	-	0.84	0.80
	+ ACADTRAIN	60.09	0.99	-	0.86	0.81
	Salamandra-7B	55.65	0.98	-	0.86	0.80
	+ ACADTRAIN	61.61	0.99	-	0.86	0.81
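
The effect of fine-tuning on the 2B model can be summarized from the four tables above. The sketch assumes that the "+ ACADTRAIN" row directly beneath Salamandra-2B corresponds to that model after fine-tuning on ACAD-Train; the d-BLEU values are copied from the tables and the variable names are illustrative.

```python
# d-BLEU for Salamandra-2B (before, after "+ ACADTRAIN"), from the tables above.
D_BLEU = {
    "xx->en": (37.09, 48.45),
    "en->xx": (32.91, 46.86),
    "xx->es": (50.09, 61.97),
    "es->xx": (47.84, 60.09),
}

# Absolute d-BLEU improvement per direction group, rounded to two decimals.
gains = {
    direction: round(after - before, 2)
    for direction, (before, after) in D_BLEU.items()
}
mean_gain = round(sum(gains.values()) / len(gains), 2)

for direction, gain in gains.items():
    print(f"{direction}: +{gain:.2f} d-BLEU")
print(f"mean gain: +{mean_gain:.2f} d-BLEU")
```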

Ethical Considerations and Limitations

Detailed information on the work done to examine the presence of unwanted social and cognitive biases in the base model can be found in the Salamandra-2B model card. No specific analysis has yet been carried out to evaluate potential biases or limitations in translation accuracy across different languages, dialects, or domains. However, we recognize the importance of identifying and addressing any harmful stereotypes, cultural inaccuracies, or systematic performance discrepancies that may arise in Machine Translation. As such, we plan to continue performing more analyses as we implement the necessary metrics and methods within our evaluation framework MT-Lens. Note that the model has only undergone preliminary instruction tuning. We urge developers to consider potential limitations and conduct safety testing and tuning tailored to their specific applications.

Additional information

Author

The Language Technologies Unit from Barcelona Supercomputing Center.

Contact

For further information, please send an email to [email protected].

Copyright

Funding

This work is funded by the Ministerio para la Transformación Digital y de la Función Pública - Funded by EU – NextGenerationEU within the framework of the project Modelos del Lenguaje.

This work has been promoted and financed by the Government of Catalonia through the Aina project.

This work is funded by the Ministerio para la Transformación Digital y de la Función Pública - Funded by EU – NextGenerationEU within the framework of the [project ILENIA](https://proyectoilenia.es/) with reference 2022/TL22/00215337.

Disclaimer

Be aware that the model may contain biases or other unintended distortions. When third parties deploy systems or provide services based on this model, or use the model themselves, they bear the responsibility for mitigating any associated risks and ensuring compliance with applicable regulations, including those governing the use of Artificial Intelligence.

The Barcelona Supercomputing Center, as the owner and creator of the model, shall not be held liable for any outcomes resulting from third-party use.

Citation

@misc{lacunza2025acadataparalleldatasetacademic,
      title={ACADATA: Parallel Dataset of Academic Data for Machine Translation}, 
      author={Iñaki Lacunza and Javier Garcia Gilabert and Francesca De Luca Fornaciari and Javier Aula-Blasco and Aitor Gonzalez-Agirre and Maite Melero and Marta Villegas},
      year={2025},
      eprint={2510.12621},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2510.12621}, 
}

License

Apache License, Version 2.0
