---
license: apache-2.0
language: en
library_name: optimum
tags:
- onnx
- quantized
- text-classification
- nvidia
- nemotron
pipeline_tag: text-classification
---

# Quantized ONNX model for botirk/tiny-prompt-task-complexity-classifier

This repository contains the quantized ONNX version of the [nvidia/prompt-task-and-complexity-classifier](https://huggingface.co/nvidia/prompt-task-and-complexity-classifier) model.

## Model Description

This is a multi-headed model that classifies English text prompts across task types and complexity dimensions. This version has been quantized to `INT8` using dynamic quantization with the [🤗 Optimum](https://github.com/huggingface/optimum) library, resulting in a smaller footprint and faster CPU inference.
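
For reference, the quantization step can be reproduced with Optimum's `ORTQuantizer`. The snippet below is a minimal sketch, not the exact recipe used for this repository: it assumes the base checkpoint exports through the standard sequence-classification path (the original is a custom multi-headed architecture, so the real export may have needed extra steps), and the `onnx-fp32`/`onnx-int8` directory names are placeholders.

```python
from optimum.onnxruntime import ORTModelForSequenceClassification, ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig

# Export the original PyTorch checkpoint to ONNX (assumption: the base model
# is compatible with the standard sequence-classification export).
model = ORTModelForSequenceClassification.from_pretrained(
    "nvidia/prompt-task-and-complexity-classifier", export=True
)
model.save_pretrained("onnx-fp32")  # placeholder directory name

# Dynamic INT8 quantization: weights are quantized ahead of time,
# activations are quantized on the fly at inference.
quantizer = ORTQuantizer.from_pretrained("onnx-fp32")
dqconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)
quantizer.quantize(save_dir="onnx-int8", quantization_config=dqconfig)
```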

For more details on the model architecture, tasks, and complexity dimensions, please refer to the [original model card](https://huggingface.co/nvidia/prompt-task-and-complexity-classifier).

## How to Use

You can use this model directly with `optimum.onnxruntime` for accelerated inference.

First, install the required libraries:

```bash
pip install "optimum[onnxruntime]" transformers
```

Then, you can use the model in a pipeline:

```python
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

repo_id = "botirk/tiny-prompt-task-complexity-classifier"
model = ORTModelForSequenceClassification.from_pretrained(repo_id)
tokenizer = AutoTokenizer.from_pretrained(repo_id)

# Note: the pipeline task is a simplification. For the full multi-headed
# output, process the logits manually (see the sketch below).
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

prompt = "Write a mystery set in a small town where an everyday object goes missing."
results = classifier(prompt)
print(results)
```
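
The single top label returned by the pipeline hides the other heads. As a starting point for manual post-processing, here is a hedged sketch: it assumes the exported graph returns one flattened `logits` tensor (which is why `ORTModelForSequenceClassification` can load it), and the slice boundaries below are placeholders, since the per-head layout is documented only in the original model card.

```python
import torch
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer

repo_id = "botirk/tiny-prompt-task-complexity-classifier"
model = ORTModelForSequenceClassification.from_pretrained(repo_id)
tokenizer = AutoTokenizer.from_pretrained(repo_id)

prompt = "Write a mystery set in a small town where an everyday object goes missing."
inputs = tokenizer(prompt, return_tensors="pt", truncation=True)

# A single logits tensor comes back; the multi-headed structure is flattened
# into it. Slice it per head before applying softmax.
logits = model(**inputs).logits[0]

# PLACEHOLDER: the `task_slice` bounds are illustrative only, not the real
# head layout. Take the actual per-head class counts from the original
# model card and config.
task_slice = logits[:11]
task_probs = torch.softmax(task_slice, dim=-1)
print(task_probs)
```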