Zeroshot Classifiers
These are my current best zeroshot classifiers. Some of my older models are downloaded more often, but the newer models at the top of this collection perform better.
MoritzLaurer/deberta-v3-large-zeroshot-v2.0
Note:
- Performance: most performant model in this collection
- Size: 0.43B parameters, 870 MB
- Other: English only; max context length 512 tokens; can be a bit slower than RoBERTa models
- Alternatives: same model trained only on commercially-friendly data: https://huggingface.co/MoritzLaurer/deberta-v3-large-zeroshot-v2.0-c; longer context & multilingual: https://huggingface.co/MoritzLaurer/bge-m3-zeroshot-v2.0
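For orientation, here's a minimal sketch of loading this model with the Hugging Face zero-shot classification pipeline (the example text and labels are made up):

```python
from transformers import pipeline

# Load the zero-shot classification pipeline with this model
classifier = pipeline(
    "zero-shot-classification",
    model="MoritzLaurer/deberta-v3-large-zeroshot-v2.0",
)

text = "The central bank raised interest rates by 50 basis points."
candidate_labels = ["economy", "sports", "science"]

result = classifier(text, candidate_labels)
print(result["labels"])  # labels sorted by score, highest first
print(result["scores"])  # corresponding scores
```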
MoritzLaurer/bge-m3-zeroshot-v2.0
Note:
- Performance: most performant multilingual model
- Size: 0.57B parameters, 1.14 GB
- Other: 100+ languages; max context length 8192 tokens; based on bge-m3-retromae, which is based on XLM-RoBERTa
- Alternatives: same model trained only on commercially-friendly data: https://huggingface.co/MoritzLaurer/bge-m3-zeroshot-v2.0-c
- Note: an English-only model combined with machine-translated text can perform better than a multilingual model (https://github.com/UKPLab/EasyNMT); see the translate-then-classify sketch further down
MoritzLaurer/deberta-v3-base-zeroshot-v2.0
Note:
- Performance: most performant base-size model
- Size: 0.18B parameters, 369 MB
- Other: English only; max context length 512 tokens; faster than the RoBERTa-large/BGE-M3 models, but slower than RoBERTa-base
- Alternatives: same model trained only on commercially-friendly data: https://huggingface.co/MoritzLaurer/deberta-v3-base-zeroshot-v2.0-c; longer context & multilingual: MoritzLaurer/bge-m3-zeroshot-v2.0
MoritzLaurer/roberta-large-zeroshot-v2.0-c
Note:
- Performance & speed: less performant than the deberta-v3 variants, but a bit faster and compatible with flash attention and TEI containers
- Size: 0.35B parameters, 711 MB
- Other: trained only on commercially-friendly data; English only; max context length 512 tokens
- Alternatives: smaller, more efficient version: https://huggingface.co/MoritzLaurer/roberta-base-zeroshot-v2.0
MoritzLaurer/deberta-v3-large-zeroshot-v1.1-all-33
Note: [old] Zeroshot model trained on a mixture of 33 datasets with 389 classes, reformatted into the universal NLI format. It's compatible with the Hugging Face zeroshot pipeline. The model is English-only. You can also use it for multilingual zeroshot classification by first machine translating texts to English.
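To make the NLI reformulation concrete: the pipeline inserts each candidate label into a hypothesis template and scores entailment for each (text, hypothesis) pair. A minimal sketch, with an illustrative custom template (not taken from the model card):

```python
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="MoritzLaurer/deberta-v3-large-zeroshot-v1.1-all-33",
)

result = classifier(
    "I first thought I liked the movie, but on reflection it was disappointing.",
    candidate_labels=["positive", "negative", "neutral"],
    # Each label replaces {} to form an NLI hypothesis that the model scores
    hypothesis_template="The sentiment of this text is {}.",
    multi_label=False,  # set True to score each label independently
)
print(result)
```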
MoritzLaurer/deberta-v3-base-zeroshot-v1.1-all-33
Note: [old] Essentially the same as its larger sibling MoritzLaurer/deberta-v3-large-zeroshot-v1.1-all-33, just smaller. Use it if you need more speed. The model is English-only.
MoritzLaurer/deberta-v3-xsmall-zeroshot-v1.1-all-33
Note: [old] Same as above, just smaller/faster.
MoritzLaurer/xtremedistil-l6-h256-zeroshot-v1.1-all-33
Note: [old] Same as above, just even faster. The model has only 22 million backbone parameters and is just 25 MB (or 13 MB with ONNX quantization).
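For reference, a hedged sketch of producing an ONNX-quantized variant with the optimum library; the output paths and quantization config here are assumptions for illustration, not necessarily the recipe behind the 13 MB figure above:

```python
from optimum.onnxruntime import ORTModelForSequenceClassification, ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig

model_id = "MoritzLaurer/xtremedistil-l6-h256-zeroshot-v1.1-all-33"

# Export the PyTorch checkpoint to ONNX
model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
model.save_pretrained("onnx")  # hypothetical output directory

# Dynamic int8 quantization (one possible config, chosen for illustration)
quantizer = ORTQuantizer.from_pretrained("onnx")
qconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)
quantizer.quantize(save_dir="onnx-quantized", quantization_config=qconfig)
```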
MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7
Note: [old] This model can do zeroshot classification in ~100 languages. Advice: multilingual models tend to be weaker than English-only models. For maximum performance, it can be better to first machine translate texts to English and then use an English-only model for zeroshot classification (see the other English-only models in this collection, and the sketch below). For free open-source machine translation, I recommend https://github.com/UKPLab/EasyNMT.
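A minimal sketch of that translate-then-classify workflow, assuming EasyNMT's opus-mt models (the German example sentence and label set are made up):

```python
from easynmt import EasyNMT
from transformers import pipeline

# Step 1: machine translate to English with EasyNMT
translator = EasyNMT("opus-mt")
text_de = "Die Regierung hat gestern neue Klimaziele angekündigt."
text_en = translator.translate(text_de, target_lang="en")

# Step 2: classify the translation with an English-only zeroshot model
classifier = pipeline(
    "zero-shot-classification",
    model="MoritzLaurer/deberta-v3-large-zeroshot-v2.0",
)
print(classifier(text_en, candidate_labels=["politics", "sports", "business"]))
```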
MoritzLaurer/mDeBERTa-v3-base-mnli-xnli
Note: [old] I've received feedback from users that MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7 performs worse on some languages. It might be worth trying this one too.
MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli
Note: This model is trained on 5 NLI datasets only. It might be better specifically at the NLI task than the "zeroshot" models, and it returns three classes (entailment/contradiction/neutral). I would generally recommend the "zeroshot" models, however: they have been trained on the same 5 NLI datasets plus many additional datasets, but they return only two classes (entailment/not_entailment).
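For the three-class NLI use case, a hedged sketch of direct inference without the pipeline; the premise/hypothesis pair is made up, and the class names should be read from the checkpoint's config rather than hard-coded:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "MoritzLaurer/DeBERTa-v3-large-mnli-fever-anli-ling-wanli"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

premise = "The restaurant was packed and we waited an hour for a table."
hypothesis = "The restaurant was empty."

inputs = tokenizer(premise, hypothesis, truncation=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# id2label comes from the model config, so the entailment/neutral/contradiction
# ordering is whatever this checkpoint defines
probs = torch.softmax(logits, dim=-1)[0]
for idx, p in enumerate(probs):
    print(model.config.id2label[idx], round(p.item(), 3))
```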