# ArabicT5 Model for Arabic News Classification and Generation
- This model focuses on classifying Arabic news and generating Arabic news text.
# The number in the generated text represents the category of the news, as shown below:
category_mapping = {
    'Political': 1,
    'Economy': 2,
    'Health': 3,
    'Sport': 4,
    'Culture': 5,
    'Technology': 6,
    'Art': 7,
    'Accidents': 8,
}
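Because the generated text carries the numeric ID rather than the category name, a reverse lookup is usually needed when reading model output. The following is a minimal sketch of one way to do that; `id_to_category` and `category_name` are hypothetical helpers introduced here for illustration and are not part of the model or its repository.

```python
# Mapping copied from the model card: category name -> numeric ID.
category_mapping = {
    "Political": 1,
    "Economy": 2,
    "Health": 3,
    "Sport": 4,
    "Culture": 5,
    "Technology": 6,
    "Art": 7,
    "Accidents": 8,
}

# Hypothetical helper (not from the model card): numeric ID -> category name.
id_to_category = {idx: name for name, idx in category_mapping.items()}


def category_name(category_id: int) -> str:
    """Return the category name for an ID, or 'Unknown' if the ID is not mapped."""
    return id_to_category.get(category_id, "Unknown")


print(category_name(1))
```

Returning `"Unknown"` for unmapped IDs is a design choice of this sketch: a generative model is not guaranteed to emit only the eight IDs listed above.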
# Training parameters
| Parameter | Value |
|---|---|
| Training batch size | 8 |
| Evaluation batch size | 8 |
| Learning rate | 1e-4 |
| Max input length | 64 |
| Max target length | 200 |
| Number of workers | 4 |
| Epochs | 5 |
# Results
| Metric | Value |
|---|---|
| Training Loss | 3.20 |
| Classification Accuracy | 95.7% |
| Generation Accuracy | 88.87% |
# Example usage
from transformers import T5ForConditionalGeneration, T5Tokenizer, pipeline

model_name = "Hezam/ArabicT5-49GB-small-classification-generation"
model = T5ForConditionalGeneration.from_pretrained(model_name)
tokenizer = T5Tokenizer.from_pretrained(model_name)
generation_pipeline = pipeline("text2text-generation", model=model, tokenizer=tokenizer)

text = "أوقفوا القتل الجماعي في غزة"  # "Stop the mass killing in Gaza"
output = generation_pipeline(
    text,
    num_beams=10,
    max_length=200,
    top_p=0.9,  # only takes effect when sampling (do_sample=True)
    repetition_penalty=3.0,
    no_repeat_ngram_size=3,
)[0]["generated_text"]
print(output)
category: 1 article: كتب عبد اللطيف صبح قال الرءيس الفلسطيني محمود عباس في تصريح ل اليوم السابع وقفوا القتل الجماعي في مدينه غزة مءكدا يجب يوقفوا قتل المدنيين العزل في قطاعي غزة والضفه وغزه واوقفوا القتل الجماع
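The sample output suggests the layout `category: <number> article: <text>`. Assuming that layout holds for other inputs (the card shows only this one sample, so this is an assumption), the generated string could be split into its two parts roughly as follows. `parse_generated_text` is a hypothetical helper, not something shipped with the model.

```python
import re

# Assumed layout, inferred from the single sample on this card:
# "category: <number> article: <text>"
_OUTPUT_PATTERN = re.compile(r"category:\s*(\d+)\s*article:\s*(.*)", re.DOTALL)


def parse_generated_text(generated: str):
    """Split generated text into (category_id, article).

    Returns None when the text does not follow the assumed layout.
    """
    match = _OUTPUT_PATTERN.match(generated.strip())
    if match is None:
        return None
    return int(match.group(1)), match.group(2).strip()


sample = "category: 1 article: كتب عبد اللطيف صبح"
print(parse_generated_text(sample))
```

Returning `None` instead of raising keeps the caller in control when the model produces text that does not start with a category prefix.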