---
license: apache-2.0
language: en
library_name: optimum
tags:
- onnx
- quantized
- text-classification
- nvidia
- nemotron
pipeline_tag: text-classification
---

# Quantized ONNX model for botirk/tiny-prompt-task-complexity-classifier

This repository contains the quantized ONNX version of the [nvidia/prompt-task-and-complexity-classifier](https://huggingface.co/nvidia/prompt-task-and-complexity-classifier) model.

## Model Description

This is a multi-headed model that classifies English text prompts across task types and complexity dimensions. This version has been quantized to `INT8` using dynamic quantization with the [🤗 Optimum](https://github.com/huggingface/optimum) library, resulting in a smaller footprint and faster CPU inference.
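
For reference, dynamic INT8 quantization of an exported ONNX model can be done with Optimum's `ORTQuantizer`. The sketch below shows the general recipe; the exact configuration (target ISA, per-channel setting) used to produce this checkpoint is an assumption, and `onnx_model_dir` is a placeholder:

```python
from optimum.onnxruntime import ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig

# Dynamic (weights-only) INT8 configuration targeting AVX2 CPUs.
# The actual settings used for this checkpoint are assumptions.
qconfig = AutoQuantizationConfig.avx2(is_static=False, per_channel=False)

# "onnx_model_dir" stands in for a directory containing the exported
# float ONNX model (e.g. produced by `optimum-cli export onnx`).
quantizer = ORTQuantizer.from_pretrained("onnx_model_dir")
quantizer.quantize(save_dir="onnx_model_dir_quantized", quantization_config=qconfig)
```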

For more details on the model architecture, tasks, and complexity dimensions, please refer to the [original model card](https://huggingface.co/nvidia/prompt-task-and-complexity-classifier).

## How to Use

You can use this model directly with `optimum.onnxruntime` for accelerated inference.

First, install the required libraries:
```bash
pip install optimum[onnxruntime] transformers
```

Then, you can use the model in a pipeline:
```python
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

repo_id = "botirk/tiny-prompt-task-complexity-classifier"
model = ORTModelForSequenceClassification.from_pretrained(repo_id)
tokenizer = AutoTokenizer.from_pretrained(repo_id)

# Note: the text-classification pipeline only surfaces a single head's
# output. For the full multi-headed output, process the raw logits
# manually (see the sketch below).
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

prompt = "Write a mystery set in a small town where an everyday object goes missing."
results = classifier(prompt)
print(results)
```
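
For the full multi-headed output, you can run the model directly and post-process the raw logits. The sketch below assumes the ONNX export exposes a single `logits` tensor; how its columns map to the individual task-type and complexity heads is not documented here and must be verified against the original model card, so treat this as a starting point rather than a definitive recipe:

```python
import numpy as np
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer

repo_id = "botirk/tiny-prompt-task-complexity-classifier"
model = ORTModelForSequenceClassification.from_pretrained(repo_id)
tokenizer = AutoTokenizer.from_pretrained(repo_id)

prompt = "Write a mystery set in a small town where an everyday object goes missing."
inputs = tokenizer(prompt, return_tensors="np", truncation=True)

# Run the ONNX model; outputs.logits holds the raw head outputs.
outputs = model(**inputs)
logits = np.asarray(outputs.logits)
print("logits shape:", logits.shape)

# Example post-processing: softmax over the last axis. Splitting the
# columns into per-head slices is left to the reader, since the head
# layout of this export is an assumption to be checked against the
# original model card.
probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
probs /= probs.sum(axis=-1, keepdims=True)
print(probs)
```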