---
language:
- en
tags:
- sentence-transformers
- cross-encoder
- reranker
- generated_from_trainer
- dataset_size:942069
- loss:PrecomputedDistillationLoss
base_model: jhu-clsp/ettin-encoder-17m
datasets:
- dleemiller/all-nli-distill
pipeline_tag: text-classification
library_name: sentence-transformers
metrics:
- f1_macro
- f1_micro
- f1_weighted
model-index:
- name: CrossEncoder based on jhu-clsp/ettin-encoder-17m
  results:
  - task:
      type: cross-encoder-classification
      name: Cross Encoder Classification
    dataset:
      name: AllNLI dev
      type: AllNLI-dev
    metrics:
    - type: f1_macro
      value: 0.843215238686306
      name: F1 Macro
    - type: f1_micro
      value: 0.8435163046243068
      name: F1 Micro
    - type: f1_weighted
      value: 0.8438547382511594
      name: F1 Weighted
  - task:
      type: cross-encoder-classification
      name: Cross Encoder Classification
    dataset:
      name: AllNLI test
      type: AllNLI-test
    metrics:
    - type: f1_macro
      value: 0.8442865676487733
      name: F1 Macro
    - type: f1_micro
      value: 0.8446784696784697
      name: F1 Micro
    - type: f1_weighted
      value: 0.8449960204914074
      name: F1 Weighted
---
# EttinX Cross-Encoder: Natural Language Inference (NLI)

This cross-encoder performs sequence classification with contradiction/neutral/entailment labels, and is
drop-in compatible with comparable Sentence Transformers cross-encoders.

To train this model, I added teacher logits from the `dleemiller/ModernCE-large-nli` model to the AllNLI
dataset, published as `dleemiller/all-nli-distill`. Distilling from these teacher logits significantly
improves performance over standard training.

The 17M-parameter Ettin architecture is based on ModernBERT, making this model an excellent candidate for
lightweight **CPU inference**.
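The exact distillation objective isn't spelled out in this card, but the idea behind a precomputed-logit
distillation loss is simple to sketch. Below is a minimal, illustrative PyTorch version that blends a soft
KL term against the stored teacher logits with the usual hard-label cross-entropy; the function name,
`temperature`, and `alpha` values are assumptions for illustration, not the actual training code:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    """Blend soft-target KL distillation with hard-label cross-entropy.

    student_logits: (batch, 3) raw scores from the model being trained
    teacher_logits: (batch, 3) precomputed scores stored in the dataset
    labels:         (batch,)   gold contradiction/entailment/neutral ids
    """
    # Soft targets: KL divergence between temperature-scaled distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature**2  # standard scaling so gradient magnitudes match the hard loss

    # Hard targets: ordinary cross-entropy on the gold labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```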
---
## Features

- **High performance:** achieves **80.47%** (MNLI mismatched) and **86.95%** (SNLI test) Micro F1.
- **Efficient architecture:** based on the Ettin encoder design at only 17M parameters, giving fast inference even on CPU (see the latency sketch after this list).
- **Extended context length:** processes sequences up to 8192 tokens, great for LLM output evals.
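A quick way to sanity-check CPU throughput for yourself; timings vary widely by hardware, and `device="cpu"`
is a standard `CrossEncoder` constructor argument:

```python
import time
from sentence_transformers import CrossEncoder

model = CrossEncoder("dleemiller/EttinX-nli-xxs", device="cpu")
pairs = [("A man is eating pizza", "A man eats something")] * 32

start = time.perf_counter()
model.predict(pairs, batch_size=32)
elapsed = time.perf_counter() - start
print(f"{len(pairs) / elapsed:.1f} pairs/sec on CPU")
```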
---
## Performance
| Model                                   | MNLI Mismatched (Micro F1) | SNLI Test (Micro F1) | Context Length | Parameters |
|-----------------------------------------|----------------------------|----------------------|----------------|------------|
| `dleemiller/ModernCE-large-nli`         | **0.9202**                 | 0.9110               | 8192           | 395M       |
| `dleemiller/ModernCE-base-nli`          | 0.9034                     | 0.9025               | 8192           | 149M       |
| `cross-encoder/deberta-v3-large`        | 0.9049                     | 0.9220               | 512            | 435M       |
| `cross-encoder/deberta-v3-base`         | 0.9004                     | **0.9234**           | 512            | 184M       |
| `cross-encoder/nli-distilroberta-base`  | 0.8398                     | 0.8838               | 512            | 82M        |
| `dleemiller/EttinX-nli-xxs`             | 0.8047                     | 0.8695               | 8192           | 17M        |
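A sketch for reproducing the SNLI number, assuming the model's label order from the usage example below
(`[contradiction, entailment, neutral]`) and the standard SNLI label ids (0=entailment, 1=neutral,
2=contradiction); `stanfordnlp/snli` is the current Hub id for SNLI:

```python
from datasets import load_dataset
from sentence_transformers import CrossEncoder
from sklearn.metrics import f1_score

model = CrossEncoder("dleemiller/EttinX-nli-xxs")

# SNLI uses 0=entailment, 1=neutral, 2=contradiction; -1 marks pairs without gold consensus.
test = load_dataset("stanfordnlp/snli", split="test").filter(lambda ex: ex["label"] != -1)

scores = model.predict(list(zip(test["premise"], test["hypothesis"])))

# Map the model's [contradiction, entailment, neutral] order onto SNLI label ids.
to_snli = {0: 2, 1: 0, 2: 1}
preds = [to_snli[i] for i in scores.argmax(axis=1)]

print("Micro F1:", f1_score(test["label"], preds, average="micro"))
```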
---
## Usage
To use EttinX for NLI tasks, load the model with the `sentence-transformers` library:
```python
from sentence_transformers import CrossEncoder

# Load the EttinX NLI cross-encoder
model = CrossEncoder("dleemiller/EttinX-nli-xxs")

# Predict logits for (premise, hypothesis) pairs
scores = model.predict([
    ("A man is eating pizza", "A man eats something"),
    ("A black race car starts up in front of a crowd of people.", "A man is driving down a lonely road."),
])

# Convert logits to labels
label_mapping = ["contradiction", "entailment", "neutral"]
labels = [label_mapping[idx] for idx in scores.argmax(axis=1)]
# ['entailment', 'contradiction']
```
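Continuing from the snippet above: `predict` returns raw logits for multi-label cross-encoders, so if you
want class probabilities, apply a softmax yourself (or pass `apply_softmax=True` to `predict`, which
`CrossEncoder` supports):

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

probs = softmax(scores)
print(probs.round(3))  # one row per pair: [contradiction, entailment, neutral]
```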
---
## Training Details
### Pretraining

The model is initialized from the `jhu-clsp/ettin-encoder-17m` weights.

Details:
- Batch size: 512
- Learning rate: 1e-4
- Attention dropout: 0.1
### Fine-Tuning
Fine-tuning was performed on the `dleemiller/all-nli-distill` dataset.
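To see what the distillation data looks like, you can load the dataset directly; the exact column names
(for example, the field holding the teacher logits) are whatever the dataset defines, so inspect the schema
rather than assuming it:

```python
from datasets import load_dataset

ds = load_dataset("dleemiller/all-nli-distill", split="train")
print(ds.features)  # inspect the schema, including the teacher-logit column
print(ds[0])        # one premise/hypothesis row with its precomputed logits
```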
### Validation Results

The model achieved the following test-set Micro F1 scores after fine-tuning:
- **MNLI (mismatched):** 0.8047
- **SNLI (test):** 0.8695
---
## Model Card
- **Architecture:** Ettin-encoder-17m
- **Fine-Tuning Data:** `dleemiller/all-nli-distill`
---
## Thank You
Thanks to the Johns Hopkins CLSP team for providing the Ettin encoder models, and the Sentence Transformers team for their leadership in transformer encoder models.
---
## Citation
If you use this model in your research, please cite:
```bibtex
@misc{ettinxnli2025,
author = {Miller, D. Lee},
title = {EttinX NLI: An NLI cross encoder model},
year = {2025},
publisher = {Hugging Face Hub},
url = {https://huggingface.co/dleemiller/EttinX-nli-xxs},
}
```
---
## License
This model is licensed under the [MIT License](LICENSE).