Mantra-14B

Mantra-14B is a 14.7B-parameter, instruction-tuned bilingual large language model for Hindi and English, trained on a mixed-language dataset.

  • ~0.7% higher average benchmark scores on English tasks than the original model
  • ~2.8% higher average benchmark scores on Hindi tasks than the original model
  • ~4.4% better performance on harder English benchmarks (Open LLM Leaderboard evals)
  • ~8.5% lower emissions than the original model (as reported for benchmark evaluations such as the Open LLM Leaderboard)
  • Less bias from the ordering of answer choices when answering MCQs (see the sketch below)
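
One way to sanity-check the ordering-bias point is to score an MCQ under every permutation of its answer choices and see whether the selected answer changes. Below is a minimal sketch using standard transformers generation; the prompt follows the MCQ template from the Prompt Formats section, while the example question and the first-letter scoring heuristic are our own illustrative assumptions, not the official evaluation protocol.

```python
# Sketch: probe MCQ ordering bias by permuting the answer choices.
# Assumes the "Question ### A) a, B) b, ... ### MCQ ###" template below;
# scoring by the first generated letter is a heuristic for illustration.
from itertools import permutations
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "large-traversaal/Mantra-14B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

question = "Which planet is known as the Red Planet?"  # hypothetical example
choices = ["Mars", "Venus", "Jupiter", "Mercury"]

picked = []
for perm in permutations(choices):
    options = ", ".join(f"{letter}) {text}" for letter, text in zip("ABCD", perm))
    prompt = f"{question} ### {options} ### MCQ ###"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=5, do_sample=False)
    answer = tokenizer.decode(out[0, inputs.input_ids.shape[1]:], skip_special_tokens=True)
    letter = answer.strip()[:1].upper()
    # Map the predicted letter back to the underlying choice text.
    picked.append(dict(zip("ABCD", perm)).get(letter, "?"))

# An order-robust model should pick the same underlying answer every time.
print(f"{len(set(picked))} distinct answers across {len(picked)} orderings: {set(picked)}")
```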

Model Details:

  • Developed by: Traversaal.ai, 1-800-LLMs
  • Language(s) (NLP): optimized for Hindi and English
  • Model size: 14.7B parameters (Safetensors, BF16)
  • License: Apache 2.0
  • Paper: TBA (April 15)


Prompt Formats

Task | Input Format
Natural Language Inference | "Text1 ### Text2 ### NLI ###"
Multiple Choice Questions | "Question ### A) a, B) b, ... ### MCQ ###"
Numeric Questions | "Question ### NUMERIC ###"
Boolean Questions | "Question ### BOOLEAN ###"
Questions seeking long responses | "Question ### LONG RESPONSE ###"
Short responses (a few words) | "Input ### DIRECT RESPONSE ###"
Coding | "Input ### CODE ###"
Text Summarization | "Input ### SUMMARIZE ###"
Paraphrasing/Rephrasing | "Input ### PARAPHRASE ###"
Translation to a specified language | "Input ### TRANSLATION [lang] ###"
Text Simplification/ELI5 | "Input ### SIMPLIFY ###"

The prompt formats above were used during training and are better suited for inference; however, the model also works well without such formatting.
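
For example, a translation request using the template above can be issued with standard transformers generation. This is a minimal sketch: the input sentence and generation settings are illustrative defaults, not tuned recommendations.

```python
# Minimal inference sketch using the training-time prompt templates above.
# Generation settings are illustrative, not tuned recommendations.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "large-traversaal/Mantra-14B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"  # BF16, as shipped
)

# Translation template: "Input ### TRANSLATION [lang] ###"
prompt = "The weather is lovely today. ### TRANSLATION [Hindi] ###"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=100, do_sample=False)
print(tokenizer.decode(output[0, inputs.input_ids.shape[1]:], skip_special_tokens=True))
```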

Evaluation:

We evaluated our model on multiple well-known benchmarks to measure its effectiveness against other leading models, and the results are as follows:

Model | ARC-C | ARC-E | BoolQ | CMCQ | MMLU | Average* | MMLU-Pro | GPQA | MuSR | BBH | MATH-Hard
AryaBhatta-GemmaUltra-8.5B | 22.70 | 25.04 | 22.95 | 62.23 | 23.70 | 31.32 | 22.66 | 25.34 | 42.72 | 41.12 | 2.95
Airavata-7B | 25.09 | 30.47 | 25.31 | 62.17 | 33.20 | 35.25 | 16.35 | 27.43 | 37.57 | 36.00 | 13.60
sarvam-1-2B | 30.03 | 33.25 | 62.17 | 42.80 | 27.90 | 39.23 | - | - | - | - | -
Nemotron-4-Mini-Hindi-Instruct | 55.80 | 71.63 | 62.11 | 68.10 | 43.20 | 60.17 | 25.95 | 30.87 | 41.53 | 40.11 | 2.04
Llama-3-Nanda-10B-Chat | 65.36 | 80.64 | 82.29 | 67.60 | 50.61 | 69.30 | 31.57 | 30.12 | 43.52 | 49.38 | 5.59
Krutrim-2-12b-instruct | 67.32 | 81.10 | 84.74 | 76.30 | 56.10 | 73.11 | - | - | - | - | -
aya-expanse-8b | 74.06 | 87.08 | 86.45 | 83.30 | 56.89 | 77.56 | 30.04 | 30.29 | 37.17 | 49.42 | 7.02
aya-expanse-32B | 85.41 | 95.08 | 90.43 | 89.80 | 69.71 | 86.08 | 41.30 | 32.55 | 38.62 | 56.29 | 13.37
Mantra-14B | 97.39 | 92.24 | 87.65 | 87.40 | 75.59 | 88.05 | 52.39 | 39.77 | 49.07 | 66.97 | 23.11

Table 1: Scores (two decimal places) of our model and other LLMs on several English benchmarks. *Average is taken over the first five benchmarks (ARC-C, ARC-E, BoolQ, CMCQ, MMLU).

Model | ARC-C | ARC-E | BoolQ | CMCQ | MMLU | Average
AryaBhatta-GemmaUltra-8.5B | 22.70 | 25.08 | 22.95 | 62.17 | 23.80 | 31.34
Airavata-7B | 22.87 | 25.13 | 23.28 | 62.17 | 33.20 | 33.33
sarvam-1-2B | 32.76 | 35.06 | 62.16 | 47.10 | 24.22 | 40.26
Llama-3-Nanda-10B-Chat | 45.99 | 60.56 | 71.96 | 54.70 | 36.35 | 53.91
Nemotron-4-Mini-Hindi-4B-Instruct | 50.68 | 63.72 | 68.74 | 51.30 | 37.18 | 54.32
Krutrim-2-12b-instruct | 56.83 | 70.66 | 78.86 | 64.10 | 46.51 | 63.39
aya-expanse-8b | 57.42 | 72.90 | 80.42 | 69.00 | 43.39 | 64.63
aya-expanse-32B | 73.29 | 85.48 | 87.73 | 79.70 | 56.96 | 76.63
Mantra-14B | 81.74 | 89.06 | 86.02 | 78.70 | 56.39 | 78.38

Table 2: Scores (two decimal places) of our model and other LLMs on several Hindi benchmarks.
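
Scores like those in Table 1 can be approximately reproduced with EleutherAI's lm-evaluation-harness. The sketch below assumes its v0.4 Python API (`lm_eval.simple_evaluate`); CMCQ and the Hindi benchmark variants are not standard harness tasks, and the exact leaderboard task names and few-shot settings may differ from those used for these tables.

```python
# Sketch: reproducing a subset of the English benchmark scores with
# EleutherAI's lm-evaluation-harness (pip install lm-eval).
# Assumption: the v0.4 Python API; task names and few-shot settings here
# may not match the exact configuration behind Tables 1 and 2.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=large-traversaal/Mantra-14B,dtype=bfloat16",
    tasks=["arc_challenge", "arc_easy", "boolq", "mmlu"],
    batch_size=8,
)
for task, metrics in results["results"].items():
    print(task, metrics)
```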
