---
license: mit
tags:
  - text-classification
  - prompt-filtering
  - moderation
  - distilbert
  - transformers
datasets:
  - VerifiedPrompts/cntxt-class-final
language:
  - en
pipeline_tag: text-classification
widget:
  - text: "Write a LinkedIn post about eco-friendly tech for Gen Z entrepreneurs."
    example_title: Context-rich prompt
  - text: "Write something"
    example_title: Vague prompt
---
# 📘 Model Card: CNTXT-Filter-Prompt-Opt

## 🔍 Model Overview
**CNTXT-Filter-Prompt-Opt** is a lightweight, high-accuracy text classification model that evaluates the **contextual completeness of user prompts** submitted to LLMs.  
It acts as a **gatekeeper** before generation, filtering out vague or spam-like input so that only well-specified prompts proceed to LLM2.

- **Base model**: `distilbert-base-uncased`
- **Trained on**: 200k labeled prompts
- **Purpose**: Prompt validation, spam filtering, and context enforcement

---

## 🎯 Intended Use

This model is intended for:
- Pre-processing prompts before LLM2 generation
- Blocking unclear or context-poor requests
- Structuring user input pipelines in AI apps, bots, and assistants

---

## 🔒 Labels

The model classifies prompts into 3 categories:

| Label | Description |
|-------|-------------|
| `has context` | Prompt is clear, actionable, and self-contained |
| `missing platform, audience, budget, goal` | Prompt lacks structural clarity |
| `Intent is unclear, Please input more context` | Vague or incoherent prompt |
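
As a minimal sketch of how these labels might drive a gatekeeping step, the helper below forwards a prompt only when the top prediction is `has context` with sufficient confidence. The function name `route_prompt`, the `min_score` threshold, the feedback messages, and the returned dictionary shape are illustrative assumptions, not part of the model's API. `classify` is any callable that returns the pipeline-style `[{'label': ..., 'score': ...}]` structure.

```python
from typing import Callable, Dict, List

# Label strings as listed in the table above.
HAS_CONTEXT = "has context"
MISSING_DETAILS = "missing platform, audience, budget, goal"
UNCLEAR_INTENT = "Intent is unclear, Please input more context"

# Hypothetical user-facing feedback for rejected prompts (assumption).
FEEDBACK = {
    MISSING_DETAILS: "Please add the platform, audience, budget, or goal.",
    UNCLEAR_INTENT: "Please describe what you want in more detail.",
}

Classifier = Callable[[str], List[Dict[str, object]]]


def route_prompt(prompt: str, classify: Classifier, min_score: float = 0.5) -> dict:
    """Decide whether a prompt should be forwarded to generation."""
    top = classify(prompt)[0]
    label, score = top["label"], float(top["score"])
    if label == HAS_CONTEXT and score >= min_score:
        return {"forward": True, "label": label, "feedback": None}
    # Low-confidence "has context" predictions fall through to generic feedback.
    feedback = FEEDBACK.get(label, "Please add more context to your prompt.")
    return {"forward": False, "label": label, "feedback": feedback}
```

With the real model, `classify` could be the `pipeline("text-classification", ...)` object from the usage section. The threshold of 0.5 is arbitrary here and would need tuning for a given application.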

---

## 📊 Training Details

- **Model**: `distilbert-base-uncased`
- **Training method**: Hugging Face AutoTrain
- **Dataset size**: 200,000 prompts (curated, curriculum style)
- **Epochs**: 3  
- **Batch size**: 8  
- **Max seq length**: 128  
- **Mixed Precision**: `fp16`  
- **LoRA**: ❌ Disabled
- **Optimizer**: AdamW
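
For a rough sense of scale, the settings above imply the following number of optimizer steps. This is a back-of-the-envelope sketch: it assumes all 200,000 prompts are in the training split, a single device, and no gradient accumulation, none of which the card states.

```python
import math

dataset_size = 200_000  # from the card
batch_size = 8          # from the card
epochs = 3              # from the card

# Assumptions: whole dataset used for training, one device, no gradient accumulation.
steps_per_epoch = math.ceil(dataset_size / batch_size)
total_steps = steps_per_epoch * epochs

print(steps_per_epoch)  # 25000
print(total_steps)      # 75000
```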

---

## ✅ Evaluation

| Metric | Score |
|--------|-------|
| Accuracy | 1.0 |
| F1 (macro/micro/weighted) | 1.0 |
| Precision / Recall | 1.0 |
| Validation Loss | 0.0 |

These scores are measured on the validation split, where the model classifies every sample correctly.
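
To make the metrics in the table concrete, here is a small pure-Python sketch of how accuracy and macro F1 are defined. It is not the code AutoTrain used to produce the reported numbers; the function names and the toy labels `"A"` and `"B"` are made up for illustration. When every prediction is correct, both functions return 1.0.

```python
from collections import Counter
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of predictions that match the reference labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for c in set(y_true) | set(y_pred):
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels for illustration only (not taken from the dataset).
y_true = ["A", "A", "B", "B"]
y_pred = ["A", "B", "B", "B"]
print(accuracy(y_true, y_pred))            # 0.75
print(round(macro_f1(y_true, y_pred), 4))  # 0.7333
```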

---

## ⚙️ How to Use

```python
from transformers import pipeline

# Load the classifier from the Hugging Face Hub.
classifier = pipeline("text-classification", model="VerifiedPrompts/CNTXT-Filter-Prompt-Opt")

prompt = "Write a business plan for a freelance app in Canada."
result = classifier(prompt)

print(result)
# Example output (the exact score may differ):
# [{'label': 'has context', 'score': 0.98}]
```