---
license: apache-2.0
language: en
library_name: optimum
tags:
- onnx
- quantized
- text-classification
- nvidia
- nemotron
pipeline_tag: text-classification
---
# Quantized ONNX model for botirk/tiny-prompt-task-complexity-classifier
This repository contains the quantized ONNX version of the [nvidia/prompt-task-and-complexity-classifier](https://huggingface.co/nvidia/prompt-task-and-complexity-classifier) model.
## Model Description
This is a multi-headed model that classifies English text prompts across task types and complexity dimensions. This version has been quantized to `INT8` using dynamic quantization with the [🤗 Optimum](https://github.com/huggingface/optimum) library, resulting in a smaller footprint and faster CPU inference.
For more details on the model architecture, tasks, and complexity dimensions, please refer to the [original model card](https://huggingface.co/nvidia/prompt-task-and-complexity-classifier).
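For reference, a quantization like this one can be reproduced with Optimum's `ORTQuantizer`. The sketch below is a minimal example, not the exact recipe used for this repository; the directory names and the `avx2` target are assumptions:
```python
from optimum.onnxruntime import ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig

# Dynamic quantization: weights are stored as INT8 and activations are
# quantized on the fly at inference time, so no calibration data is needed.
qconfig = AutoQuantizationConfig.avx2(is_static=False, per_channel=False)

# "onnx-model-dir" is a placeholder for a directory containing the
# exported (float) ONNX model.
quantizer = ORTQuantizer.from_pretrained("onnx-model-dir")
quantizer.quantize(save_dir="onnx-model-quantized", quantization_config=qconfig)
```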
## How to Use
You can use this model directly with `optimum.onnxruntime` for accelerated inference.
First, install the required libraries:
```bash
pip install optimum[onnxruntime] transformers
```
Then, you can use the model in a pipeline:
```python
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

repo_id = "botirk/tiny-prompt-task-complexity-classifier"
model = ORTModelForSequenceClassification.from_pretrained(repo_id)
tokenizer = AutoTokenizer.from_pretrained(repo_id)

# Note: the "text-classification" pipeline task is a simplification.
# For full multi-headed output, process the logits manually (see below).
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

prompt = "Write a mystery set in a small town where an everyday object goes missing."
results = classifier(prompt)
print(results)
```
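The pipeline above surfaces only a single logits tensor. To see every head the quantized graph actually exposes, you can run the ONNX file directly with `onnxruntime`. This is a minimal sketch under stated assumptions: the file name `model_quantized.onnx` and the per-head softmax post-processing should be verified against this repository's files and the original model card:
```python
import numpy as np
import onnxruntime as ort
from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer

repo_id = "botirk/tiny-prompt-task-complexity-classifier"

# "model_quantized.onnx" is an assumed file name; check the repo's file list.
onnx_path = hf_hub_download(repo_id=repo_id, filename="model_quantized.onnx")
tokenizer = AutoTokenizer.from_pretrained(repo_id)

session = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])
print("graph outputs:", [o.name for o in session.get_outputs()])

enc = tokenizer(
    "Write a mystery set in a small town where an everyday object goes missing.",
    return_tensors="np",
)
# Feed only the inputs the graph declares (e.g. input_ids, attention_mask).
ort_inputs = {i.name: enc[i.name] for i in session.get_inputs() if i.name in enc}
raw_outputs = session.run(None, ort_inputs)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# One logits tensor per head; head names and order depend on the exported
# graph, and whether softmax applies to a given head is an assumption here.
for meta, logits in zip(session.get_outputs(), raw_outputs):
    print(meta.name, softmax(logits))
```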