---
license: apache-2.0
---
## Dataset Domain {Restaurant + Laptop} Reviews 

## Overview
This work is based on [Grid Tagging Scheme for Aspect-oriented Fine-grained Opinion Extraction](https://aclanthology.org/2020.findings-emnlp.234/). The code from 
the authors' [GitHub repository](https://github.com/NJUNLP/GTS) was used, along with their dataset. 

This model requires custom code because it uses the Grid Tagging Scheme to predict labels for the input. For convenience, 
the custom code and model architecture are included with the model.

## Example Code for Inference

### STEP 1 (Installing Hugging Face libraries)

```bash
pip install --upgrade huggingface_hub transformers
```

### STEP 2 (Download the custom code and model to predict opinion target, opinion span and sentiment polarity)
```python
import os
import sys

from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer

model_id = 'gauneg/bert-gts-absa-triple'

# Download the custom model code and the decoding helper from the model repo
bert_gts_pretrained = hf_hub_download(repo_id=model_id, filename="bert_opinion.py")
post = hf_hub_download(repo_id=model_id, filename="post.py")

# Make the downloaded modules importable (os.path.dirname works on any OS)
sys.path.append(os.path.dirname(bert_gts_pretrained))
sys.path.append(os.path.dirname(post))

from bert_opinion import BertGTSOpinionTriple
from post import DecodeAndEvaluate

# Load the tokenizer, the model and the decoder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = BertGTSOpinionTriple.from_pretrained(model_id)
dec_and_infer = DecodeAndEvaluate(tokenizer)

# Two sample inputs; only test_sentence is used below
test_sentence0 = """I charge it at night and skip taking the cord with me because of the good battery life ."""
test_sentence = "The Dell Inspiron 14 Plus is the most well-rounded laptop with great display and battery life that money can buy."

# Prediction
print(dec_and_infer.decode_predict_string_one(test_sentence, model, max_len=128))
```
Expected output

```bash
[['display', 'great', 'positive'], ['battery life', 'great', 'positive']]
```
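
The prediction is a plain list of `[aspect term, opinion span, polarity]` lists, so it can be post-processed with ordinary Python. The following is a minimal sketch that groups the triples by aspect term. The helper name `group_by_aspect` is hypothetical and not part of the model's custom code, and the `triples` value is copied from the expected output above rather than produced by running the model.

```python
from collections import defaultdict

# Triples in the shape shown above: [aspect term, opinion span, polarity].
# Copied from the expected output; not produced by calling the model here.
triples = [
    ["display", "great", "positive"],
    ["battery life", "great", "positive"],
]


def group_by_aspect(triples):
    """Map each aspect term to a list of (opinion span, polarity) pairs.

    Hypothetical helper for illustration only.
    """
    grouped = defaultdict(list)
    for aspect, opinion, polarity in triples:
        grouped[aspect].append((opinion, polarity))
    return dict(grouped)


by_aspect = group_by_aspect(triples)
for aspect, opinions in by_aspect.items():
    print(aspect, opinions)
```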

## Details
The model is trained with the Grid Tagging Scheme (GTS) to predict the `Opinion Target`, `Opinion Span` and `Sentiment Polarity`. The two domain-specific datasets (laptop and restaurant reviews) were combined for training. An example of grid tagging is shown 
in the following diagram:

<figure>
  <img src="./gts_pic.png" alt="gts-image" style="width:45%">
  <figcaption>Fig 1. Grid tagging Scheme from <a href="https://aclanthology.org/2020.findings-emnlp.234/">(Wu et al., Findings 2020)</a> </figcaption>
</figure>

The sentence above contains two ABSA triples. Each triple is expressed in the following order:

[<span style="color:red">Aspect Term/Opinion Target</span>, <span style="color:#7393B3">opinion span</span>, <span style="color:purple">sentiment polarity</span>]

For this sentence, the model and the sample code shown in the snippet above extract the following opinion triples: [
[<span style="color:red">hot dogs</span>, <span style="color:#7393B3">top notch</span>, <span style="color:purple">positive</span>],
[<span style="color:red">coffee</span>, <span style="color:#7393B3">average</span>, <span style="color:purple">neutral</span>]
]
]
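
To make the tagging idea concrete, here is a simplified sketch of how two triples like these could be laid out on an upper-triangular word-pair grid, following the tag set described by Wu et al. (`A` for aspect, `O` for opinion, `Pos`/`Neu`/`Neg` for aspect-opinion word pairs, `N` otherwise). It is not the code shipped with this model or with the GTS repository. The token list is an illustrative sentence built around the two triples (the exact wording in Fig 1 may differ), and the span indices and the `build_grid` helper are assumptions made for the example.

```python
# Illustrative sentence containing the two triples above
# (an assumption: the exact wording in Fig 1 may differ).
tokens = ["The", "hot", "dogs", "are", "top", "notch", "but", "average", "coffee"]

# Hypothetical gold annotations as inclusive (start, end) token indices.
aspect_spans = {"hot dogs": (1, 2), "coffee": (8, 8)}
opinion_spans = {"top notch": (4, 5), "average": (7, 7)}
triples = [("hot dogs", "top notch", "Pos"), ("coffee", "average", "Neu")]


def build_grid(n, aspect_spans, opinion_spans, triples):
    """Build an n x n tag grid; only cells with row <= col are meaningful."""
    grid = [["N"] * n for _ in range(n)]

    def fill_span(span, tag):
        # Tag every word pair that lies inside the same span.
        start, end = span
        for i in range(start, end + 1):
            for j in range(i, end + 1):
                grid[i][j] = tag

    for span in aspect_spans.values():
        fill_span(span, "A")
    for span in opinion_spans.values():
        fill_span(span, "O")

    # Tag each (aspect word, opinion word) pair with the triple's polarity.
    for aspect, opinion, polarity in triples:
        a_start, a_end = aspect_spans[aspect]
        o_start, o_end = opinion_spans[opinion]
        for i in range(a_start, a_end + 1):
            for j in range(o_start, o_end + 1):
                row, col = min(i, j), max(i, j)  # stay in the upper triangle
                grid[row][col] = polarity
    return grid


grid = build_grid(len(tokens), aspect_spans, opinion_spans, triples)

# Print the upper triangle, one row per token.
for i, row in enumerate(grid):
    print(f"{tokens[i]:>8}", " ".join(f"{tag:>3}" for tag in row[i:]))
```

Decoding runs in the opposite direction: spans are read off the `A` and `O` cells, and the polarity tags in the cells linking an aspect span to an opinion span determine the triple's sentiment.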

Definitions <a href="https://aclanthology.org/2020.findings-emnlp.234/">(Wu et al., Findings 2020)</a>:

1. <span style="color:red">Aspect Term/Opinion Target</span>: The aspect term, also known as the opinion target, is the word or phrase in a sentence that represents a feature or entity of a product or service.
2. <span style="color:#7393B3">Opinion Term</span>: The opinion term (the opinion span in the triples above) is the term in a sentence that explicitly expresses an attitude or opinion.
3. <span style="color:purple">Sentiment Polarity</span>: The sentiment (positive, neutral or negative) expressed toward the aspect term.