marcuskd
/

norbert2_sentiment_test1

Text Classification

Norwegian Bokmål

Norwegian Nynorsk


marcuskd committed on Jan 31, 2023

Commit a771ef4 · 1 Parent(s): 0ba0e9f

Create README.md

Files changed (1)

README.md +73 -0

README.md ADDED

	@@ -0,0 +1,73 @@

+---
+datasets:
+- marcuskd/reviews_binary_not4_concat
+language:
+- 'no'
+- nb
+- nn
+metrics:
+- accuracy
+- recall
+- precision
+- f1
+---
+# Model Card for norbert2_sentiment_test1
+Sentiment analysis for Norwegian reviews.
+# Model Description
+This model was trained on a dataset that we built by concatenating the Norwegian Review Corpus (NoReC, https://github.com/ltgoslo/norec) and a sentiment dataset from Hugging Face (https://huggingface.co/datasets/sepidmnorozy/Norwegian_sentiment).
+It is intended for testing purposes only.
+- **Developed by:** Simen Aabol and Marcus Dragsten
+- **Finetuned from model:** norbert2
+# Direct Use
+Pass in Norwegian sentences to check their sentiment (negative to positive).
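A minimal sketch of how the Direct Use step might look with the `transformers` pipeline API. The repository ID is taken from this page and the tokenizer from the Preprocessing section; the `LABEL_0`/`LABEL_1` to negative/positive mapping, the helper names `load_classifier` and `describe`, and the sample sentence are assumptions for illustration, not something the README confirms.

```python
# Assumption: LABEL_0 means negative and LABEL_1 means positive. The README does
# not document the label mapping, so verify it against the model's config first.
LABEL_NAMES = {"LABEL_0": "negative", "LABEL_1": "positive"}


def load_classifier(model_id="marcuskd/norbert2_sentiment_test1"):
    """Build a text-classification pipeline (downloads the model on first use)."""
    # Imported lazily so the rest of this module works without transformers.
    from transformers import pipeline

    # The tokenizer is the one named in the Preprocessing section.
    return pipeline("text-classification", model=model_id, tokenizer="ltgoslo/norbert2")


def describe(prediction):
    """Turn one pipeline result, e.g. {'label': 'LABEL_1', 'score': 0.9}, into text."""
    name = LABEL_NAMES.get(prediction["label"], prediction["label"])
    return f"{name} ({prediction['score']:.2f})"


# Hypothetical usage (needs network access and the transformers package):
#   classifier = load_classifier()
#   print(describe(classifier("Denne filmen var helt fantastisk!")[0]))
```

Keeping the label mapping in one dictionary makes it easy to correct if the model's `id2label` configuration turns out to differ.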
+# Training Details
+## Training and Testing Data
+<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+https://huggingface.co/datasets/marcuskd/reviews_binary_not4_concat
+### Preprocessing
+Tokenized using:
+```
+from transformers import AutoTokenizer
+
+tokenizer = AutoTokenizer.from_pretrained("ltgoslo/norbert2")
+```
+Training arguments for this model:
+```
+from transformers import TrainingArguments
+
+training_args = TrainingArguments(
+    output_dir='./results',          # output directory
+    num_train_epochs=10,             # total number of training epochs
+    per_device_train_batch_size=16,  # batch size per device during training
+    per_device_eval_batch_size=64,   # batch size for evaluation
+    warmup_steps=500,                # number of warmup steps for learning rate scheduler
+    weight_decay=0.01,               # strength of weight decay
+    logging_dir='./logs',            # directory for storing logs
+    logging_steps=10,                # log metrics every 10 steps
+)
+```
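To illustrate what `warmup_steps=500` does, here is a rough sketch of a linear warmup followed by linear decay, which is the default schedule in `TrainingArguments` when no scheduler or learning rate is set (`lr_scheduler_type="linear"`, `learning_rate=5e-5`). The README does not state the total number of training steps, so `total_steps=10_000` is a hypothetical value, and `linear_warmup_lr` is an illustrative helper rather than a library function.

```python
def linear_warmup_lr(step, base_lr=5e-5, warmup_steps=500, total_steps=10_000):
    """Learning rate at `step`: linear ramp up to base_lr, then linear decay to 0."""
    if step < warmup_steps:
        # Warmup phase: scale from 0 up to base_lr over warmup_steps.
        return base_lr * step / warmup_steps
    # Decay phase: scale from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


# total_steps is hypothetical; the real value depends on dataset size and epochs.
for step in (0, 250, 500, 5_250, 10_000):
    print(step, linear_warmup_lr(step))
```

Under these assumptions the rate reaches its peak at step 500 and falls back to zero at the final step.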
+# Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+Evaluated on the test split of the dataset.
+```
+{'accuracy': 0.8357214261912695,
+ 'recall': 0.886873508353222,
+ 'precision': 0.8789025543992431,
+ 'f1': 0.8828700403896412,
+ 'total_time_in_seconds': 94.33071640000003,
+ 'samples_per_second': 31.81360340013276,
+ 'latency_in_seconds': 0.03143309443518828
+ }
+```
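As a consistency check on the reported figures, the sketch below recomputes the four metrics from a binary confusion matrix. The counts (TP=1858, FP=256, FN=237, TN=650, 3001 test samples) are not published in the README; they are back-calculated from the reported values and should be treated as an assumption. `binary_metrics` is an illustrative helper, not part of any library.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, recall, precision and F1 for the positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "recall": recall,
        "precision": precision,
        # F1 is the harmonic mean of precision and recall.
        "f1": 2 * precision * recall / (precision + recall),
    }


# Assumption: counts inferred from the reported metrics, not taken from the README.
metrics = binary_metrics(tp=1858, fp=256, fn=237, tn=650)
print(metrics)

# Throughput figures follow from the sample count and the reported total time.
n_samples = 1858 + 256 + 237 + 650      # 3001 inferred test samples
total_time = 94.33071640000003          # 'total_time_in_seconds' as reported
samples_per_second = n_samples / total_time
latency_in_seconds = total_time / n_samples
print(samples_per_second, latency_in_seconds)
```

If these counts are right, the reported accuracy, recall, precision, F1, throughput and latency are all mutually consistent.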