PrismNLI-0.4B
PrismNLI-0.4B is a compact yet powerful model, purpose-built for natural language inference (NLI) and zero-shot classification. Despite its small size, it delivers state-of-the-art performance on 8 NLI benchmarks, making it a go-to solution for high-accuracy, low-latency applications.
PrismNLI-0.4B is fine-tuned from deberta-v3-large on PrismNLI, our high-quality dataset curated specifically to improve the generalization of the trained model. For further details, please refer to our paper.
This version of the model goes beyond the original from our paper to produce a single, robust NLI model ready for off-the-shelf deployment. The enhancements include:
- Instead of starting from scratch, we start from deberta-v3-large-zeroshot-v2.0, a checkpoint of deberta-v3-large trained on diverse classification data.
- Following prior work on entailment models, we reformulate the traditional 3-way NLI classification (`entailment`, `neutral`, and `contradiction`) into a binary setup: `entailment` vs. `not-entailment`. This simplification helps the model act as a universal classifier by simply asking: Is this hypothesis true, given the premise? (See the label-mapping sketch below.)
Performance on the eight NLI benchmarks:

| Model | Average | HANS | WNLI | ANLI-r1 | ANLI-r2 | ANLI-r3 | Diagnostics | BigBench | Control |
|---|---|---|---|---|---|---|---|---|---|
| deberta-v3-large-zeroshot-v2.0 | 79.47 | 81.28 | 70.68 | 86.40 | 77.60 | 77.50 | 83.59 | 87.03 | 71.68 |
| modernBERT-large-zeroshot-v2.0 | 74.78 | 80.30 | 66.00 | 81.20 | 71.50 | 71.67 | 82.05 | 73.18 | 72.30 |
| deberta-v3-large-mfalw | 80.62 | 81.10 | 74.08 | 86.30 | 79.90 | 78.33 | 85.22 | 85.61 | 74.40 |
| PrismNLI-0.4B | 82.88 | 90.68 | 72.95 | 87.70 | 78.80 | 79.58 | 86.22 | 90.59 | 76.52 |
Training Data
The model has been fine-tuned on 515K NLI datapoints from PrismNLI, a synthetic dataset designed to improve the generalization of NLI models. The dataset was generated with Qwen2.5-72B-Instruct via our algorithm, Prismatic Synthesis, which scales synthetic data while improving the diversity of generated samples.
Model Usage
The model can be used as a standard NLI (entailment detection) classifier. Label `0` denotes `entailment`, and label `1` denotes `not-entailment`.
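For example, here is a minimal sketch of pairwise NLI inference, assuming the checkpoint loads through the standard AutoModelForSequenceClassification API (the premise/hypothesis pair is illustrative):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "Jaehun/PrismNLI-0.4B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

premise = "It was baked in a wood-fired oven and topped with San Marzano tomatoes."
hypothesis = "This text is about pizza."

# Encode the (premise, hypothesis) pair and run a single forward pass.
inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Assumes the binary head described above: index 0 = entailment,
# index 1 = not-entailment.
probs = torch.softmax(logits, dim=-1)[0]
print({"entailment": probs[0].item(), "not-entailment": probs[1].item()})
```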
Beyond NLI, the model can serve as a zero-shot classifier:
```python
from transformers import pipeline

# Load the model as a zero-shot classifier.
zeroshot_classifier = pipeline("zero-shot-classification", model="Jaehun/PrismNLI-0.4B")

text = "It was baked in a wood-fired oven and topped with San Marzano tomatoes and buffalo mozzarella."
hypothesis_template = "This text is about {}"
classes_verbalized = ['pizza', 'pasta', 'salad', 'sushi']

# Score each candidate class; multi_label=False normalizes scores across classes.
output = zeroshot_classifier(text, classes_verbalized, hypothesis_template=hypothesis_template, multi_label=False)
```
The output will look like this:
```python
{
    'sequence': 'It was baked in a wood-fired oven and topped with San Marzano tomatoes and buffalo mozzarella.',
    'labels': ['pizza', 'pasta', 'salad', 'sushi'],
    'scores': [0.9982, 0.0014, 0.0002, 0.0002],
}
```
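Under the hood, the pipeline fills each candidate class into the hypothesis template (e.g., "This text is about pizza"), runs the model on each (text, hypothesis) pair, and ranks the classes by their entailment scores, which is exactly the binary question the model was trained to answer.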
Citation
If you find this model useful, please consider citing us!
```bibtex
@misc{prismatic-synthesis,
      title={Prismatic Synthesis: Gradient-based Data Diversification Boosts Generalization in LLM Reasoning},
      author={Jaehun Jung and Seungju Han and Ximing Lu and Skyler Hallinan and David Acuna and Shrimai Prabhumoye and Mostafa Patwary and Mohammad Shoeybi and Bryan Catanzaro and Yejin Choi},
      year={2025},
      eprint={2505.20161},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2505.20161},
}
```
License/Terms of Use
Governing Terms: This model is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0) available at https://creativecommons.org/licenses/by/4.0/legalcode.
This model was trained with synthetic data generated from Qwen2.5-72B-Instruct. If this data is used to create, train, fine-tune, or otherwise improve an AI model, which is distributed or made available, such AI model may be subject to redistribution and use requirements in the Qwen License Agreement.