hadung1802/visobert-normalizer
A Vietnamese text normalization model based on the VISOBERT architecture and trained with the ASTRA framework.
Model Description
This model performs lexical normalization for Vietnamese text, converting informal text to standard Vietnamese. It was trained using the ASTRA (Self-training with Weak Supervision) framework.
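To make the task concrete, here is a toy sketch of what lexical normalization does to an input string. It is only an illustration: the lookup table `TOY_LEXICON` and the function `toy_normalize` are hypothetical names invented for this example, and a context-free dictionary lookup is not how the model works.

```python
# Toy illustration of lexical normalization. This is NOT the model's method;
# TOY_LEXICON and toy_normalize are hypothetical, made up for this example.
TOY_LEXICON = {
    "toi": "tôi",
    "di": "đi",
    "hoc": "học",
}


def toy_normalize(text: str) -> str:
    """Replace each whitespace-separated token found in the toy lexicon."""
    tokens = text.split()
    # Tokens missing from the lexicon are passed through unchanged.
    return " ".join(TOY_LEXICON.get(tok.lower(), tok) for tok in tokens)


print(toy_normalize("toi di hoc"))  # tôi đi học
```

A fixed lookup like this cannot choose between several valid standard forms of the same informal token, which is one reason a contextual model is used for this task instead.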
Performance
Training Configuration
- Student Model: VISOBERT
- Training Mode: weakly_supervised
- Learning Rate: 0.001
- Epochs: 10
- Batch Size: 16
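The settings above can be captured in a small configuration object, sketched below. The class name `TrainingConfig`, its field names, and the `total_steps` helper are assumptions made for illustration; the ASTRA training code may organize these values differently. The step count assumes one optimizer step per batch (no gradient accumulation) and that a final partial batch is kept.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical container for the hyperparameters listed in this card."""

    student_model: str = "VISOBERT"
    training_mode: str = "weakly_supervised"
    learning_rate: float = 1e-3
    epochs: int = 10
    batch_size: int = 16

    def total_steps(self, num_examples: int) -> int:
        # Batches per epoch (rounding up for a partial last batch) times epochs.
        return math.ceil(num_examples / self.batch_size) * self.epochs


config = TrainingConfig()
# 1000 is an arbitrary dataset size chosen for the example, not a reported figure.
print(config.total_steps(1000))  # 63 batches per epoch * 10 epochs = 630
```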
Usage
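from transformers import AutoTokenizer, AutoModel
import torch

# Load model and tokenizer from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("hadung1802/visobert-normalizer")
model = AutoModel.from_pretrained("hadung1802/visobert-normalizer")
model.eval()  # inference mode: disables dropout

# Example usage
text = "toi di hoc"
inputs = tokenizer(text, return_tensors="pt")

# Forward pass without gradient tracking.
# Note: AutoModel typically returns encoder outputs (hidden states),
# not normalized text.
with torch.no_grad():
    outputs = model(**inputs)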
Citation
If you use this model, please cite the ASTRA paper:
@article{astra2024,
  title={ASTRA: Self-training with Weak Supervision for Vietnamese Text Normalization},
  author={Your Name},
  journal={arXiv preprint},
  year={2024}
}