dhanikitkat committed
Commit 005bcd0 · verified · 1 Parent(s): ac452dc

Update README.md

Files changed (1)
  1. README.md +0 -54
README.md CHANGED
@@ -9,62 +9,8 @@ widget:
  - text: "Jangan sampai saya telpon bos saya ya!"
  ---

- ## Indonesian RoBERTa Base Sentiment Classifier
-
- Indonesian RoBERTa Base Sentiment Classifier is a sentiment text-classification model based on the [RoBERTa](https://arxiv.org/abs/1907.11692) architecture. It was initialized from the pre-trained [Indonesian RoBERTa Base](https://hf.co/flax-community/indonesian-roberta-base) model and fine-tuned on [`indonlu`](https://hf.co/datasets/indonlu)'s `SmSA` dataset, which consists of Indonesian comments and reviews.
-
- After training, the model achieved an evaluation accuracy of 94.36% and an F1-macro of 92.42%. On the benchmark test set, the model achieved an accuracy of 93.2% and an F1-macro of 91.02%.
-
- Hugging Face's `Trainer` class from the [Transformers](https://huggingface.co/transformers) library was used to train the model. PyTorch was used as the backend framework during training, but the model remains compatible with other frameworks.
-
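As an illustration of that setup, the sketch below shows one way to fine-tune the base checkpoint on `SmSA` with the `Trainer` API. It is a minimal, hypothetical reconstruction rather than the original training script: the `indonlu`/`smsa` dataset identifiers, the tokenization settings, and all hyperparameters other than the 5 epochs and best-model loading reported in this card are assumptions.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Assumed identifiers: the SmSA config of the indonlu dataset and the
# pre-trained Indonesian RoBERTa Base checkpoint linked above.
dataset = load_dataset("indonlu", "smsa")
tokenizer = AutoTokenizer.from_pretrained("flax-community/indonesian-roberta-base")

def tokenize(batch):
    # SmSA examples carry a "text" field and an integer "label".
    return tokenizer(batch["text"], truncation=True, max_length=128)

dataset = dataset.map(tokenize, batched=True)

# SmSA is a three-way task (positive / neutral / negative).
model = AutoModelForSequenceClassification.from_pretrained(
    "flax-community/indonesian-roberta-base", num_labels=3
)

# Illustrative arguments; only the epoch count and best-model loading
# are documented in this card, the rest are placeholders.
args = TrainingArguments(
    output_dir="indonesian-roberta-base-sentiment-classifier",
    num_train_epochs=5,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
)
trainer.train()
```
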
- ## Model
-
- | Model                                           | #params | Arch.        | Training/Validation data (text) |
- | ----------------------------------------------- | ------- | ------------ | ------------------------------- |
- | `indonesian-roberta-base-sentiment-classifier`  | 124M    | RoBERTa Base | `SmSA`                          |
-
- ## Evaluation Results
-
- The model was trained for 5 epochs, and the best model was loaded at the end.
-
- | Epoch | Training Loss | Validation Loss | Accuracy | F1       | Precision | Recall   |
- | ----- | ------------- | --------------- | -------- | -------- | --------- | -------- |
- | 1     | 0.342600      | 0.213551        | 0.928571 | 0.898539 | 0.909803  | 0.890694 |
- | 2     | 0.190700      | 0.213466        | 0.934127 | 0.901135 | 0.925297  | 0.882757 |
- | 3     | 0.125500      | 0.219539        | 0.942857 | 0.920901 | 0.927511  | 0.915193 |
- | 4     | 0.083600      | 0.235232        | 0.943651 | 0.924227 | 0.926494  | 0.922048 |
- | 5     | 0.059200      | 0.262473        | 0.942063 | 0.920583 | 0.924084  | 0.917351 |
-
- ## How to Use
-
- ### As Text Classifier
-
- ```python
- from transformers import pipeline
-
- pretrained_name = "w11wo/indonesian-roberta-base-sentiment-classifier"
-
- nlp = pipeline(
-     "sentiment-analysis",
-     model=pretrained_name,
-     tokenizer=pretrained_name
- )
-
- nlp("Jangan sampai saya telpon bos saya ya!")
- ```
-
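For cases where the `pipeline` wrapper is not convenient, the sketch below shows a roughly equivalent inference path that loads the tokenizer and model directly. It is an added illustration, not part of the original card, and it reads label names from the model's own `id2label` config rather than assuming them.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

pretrained_name = "w11wo/indonesian-roberta-base-sentiment-classifier"

tokenizer = AutoTokenizer.from_pretrained(pretrained_name)
model = AutoModelForSequenceClassification.from_pretrained(pretrained_name)
model.eval()

inputs = tokenizer("Jangan sampai saya telpon bos saya ya!", return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

probs = logits.softmax(dim=-1).squeeze(0)
# Map the predicted index back to its label name via the model config.
label = model.config.id2label[int(probs.argmax())]
print(label, float(probs.max()))
```
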
- ## Disclaimer
-
- Consider the biases that come from both the pre-trained RoBERTa model and the `SmSA` dataset, which may carry over into this model's predictions.
-
- ## Author
-
- Indonesian RoBERTa Base Sentiment Classifier was trained and evaluated by [Wilson Wongso](https://w11wo.github.io/). All computation and development were done on Google Colaboratory using its free GPU access.
-
  ## Citation

- If used, please cite the following:
-
  ```bibtex
  @misc {wilson_wongso_2023,
  author = { {Wilson Wongso} },
 