alibaba-pai/Instruction-Tagger


Bohr committed 6 days ago

Commit de43994 • 1 Parent(s): 69f6e29

Create README.md

Files changed (1)

README.md +93 -0

README.md ADDED

	@@ -0,0 +1,93 @@

+## 📖 Introduction
+**Instruction-Tagger** is a DeBERTa-v2 sequence classifier that labels each instruction with one of 33 task tags. With these tags, users can measure and adjust the proportion of tasks in a dataset.
+#### Example Input
+>What are the main differences between Type 1 and Type 2 diabetes, and how do their treatment approaches differ?
+#### Example Output
+>Medicine
+## 🚀 Quick Start
+The following snippet shows how to load the tokenizer and model and how to predict the task tag of an instruction.
+```python
+import torch
+from transformers import DebertaV2ForSequenceClassification, DebertaV2Tokenizer
+# Assumption: the weights live in the same Hub repo as the tokenizer; pass a local checkpoint directory instead if needed.
+model = DebertaV2ForSequenceClassification.from_pretrained('alibaba-pai/Instruction-Tagger', num_labels=33).cuda()
+tokenizer = DebertaV2Tokenizer.from_pretrained('alibaba-pai/Instruction-Tagger')
+labels = {0: 'Common-Sense',
+          1: 'History',
+          2: 'Law',
+          3: 'Code Generation',
+          4: 'Technology',
+          5: 'Sport',
+          6: 'Code Debug',
+          7: 'Toxicity',
+          8: 'Counterfactual',
+          9: 'Physics',
+          10: 'Academic Writing',
+          11: 'Philosophy',
+          12: 'Music',
+          13: 'Math',
+          14: 'Writing',
+          15: 'Complex Format',
+          16: 'Art',
+          17: 'Grammar',
+          18: 'Computer Science',
+          19: 'Economy',
+          20: 'Paraphrase',
+          21: 'Reasoning',
+          22: 'Medicine',
+          23: 'Biology',
+          24: 'Health',
+          25: 'Ethics',
+          26: 'Chemistry',
+          27: 'Multilingual',
+          28: 'Ecology',
+          29: 'Roleplay',
+          30: 'Entertainment',
+          31: 'Others',
+          32: 'Literature'}
+def task_cls(pp):
+    """Return the task tag predicted for a single instruction string."""
+    inputs = tokenizer(pp, return_tensors="pt", padding=True, truncation=True).to("cuda")
+    with torch.no_grad():
+        logits = model(**inputs).logits
+    predicted_class_id = logits.argmax(dim=-1).item()
+    return labels[predicted_class_id]
+instruct = "What are the main differences between Type 1 and Type 2 diabetes, and how do their treatment approaches differ?"
+tag = task_cls(instruct)  # expected tag per the example above: Medicine
+```
+## 🔍 Evaluation
+To assess the accuracy of task classification, we manually evaluated a sample of 100 entries that are not in the training set and obtained a classification precision of 92%.
+## 📜 Citation
+If you find our work helpful, please cite it!
+```
+@misc{TAPIR,
+      title={Distilling Instruction-following Abilities of Large Language Models with Task-aware Curriculum Planning},
+      author={Yuanhao Yue and Chengyu Wang and Jun Huang and Peng Wang},
+      year={2024},
+      eprint={2405.13448},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2405.13448},
+}
+```
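
The README states that the tags can be used to adjust the proportion of tasks in a dataset but does not show how. The following is a minimal sketch under stated assumptions: the tags are taken to have been produced already (for example by `task_cls`), `tag_proportions` and `rebalance` are hypothetical helper names, and capping each tag by random downsampling is only one possible strategy, not necessarily the one the authors use.

```python
import random
from collections import Counter, defaultdict


def tag_proportions(tags):
    """Return {tag: share of the dataset} for a list of task tags."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()}


def rebalance(samples, tags, max_per_tag, seed=0):
    """Keep at most `max_per_tag` samples per tag (simple downsampling)."""
    rng = random.Random(seed)
    by_tag = defaultdict(list)
    for sample, tag in zip(samples, tags):
        by_tag[tag].append(sample)
    kept = []
    for tag, group in by_tag.items():
        if len(group) > max_per_tag:
            group = rng.sample(group, max_per_tag)
        kept.extend((sample, tag) for sample in group)
    return kept


# Hypothetical data: placeholder instructions and the tags a tagger might return.
samples = ["q1", "q2", "q3", "q4", "q5", "q6"]
tags = ["Math", "Math", "Math", "Math", "Medicine", "Law"]

print(tag_proportions(tags))
balanced = rebalance(samples, tags, max_per_tag=2)
print(tag_proportions([tag for _, tag in balanced]))
```

In this toy example, "Math" starts with four of six samples and is capped at two, so the rebalanced set has four samples in total. Upsampling rare tags or sampling toward target ratios would follow the same grouping step.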