QQhahaha committed · Commit 45f927b · 1 Parent(s): 659b2d3
Update README.md
---
license: apache-2.0
language:
- zh
---
# Text Summarization
This is an assignment for Applied Deep Learning, a course at National Taiwan University (NTU).
### Task Description: Chinese News Summarization (Title Generation)
 
After the model generates the probability of every token, the Greedy strategy simply chooses the most probable next word (argmax).
However, the Greedy strategy easily falls into choosing duplicate words.
```
Greedy Result (f1-score): rouge-1: 1.5, rouge-2: 0.9, rouge-L: 1.4
```
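The argmax step described above can be sketched as follows; the vocabulary and logit values are made up for illustration and are not taken from the assignment:

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_pick(logits, vocab):
    # Greedy decoding step: always take the highest-probability token.
    probs = softmax(logits)
    return vocab[probs.index(max(probs))]

vocab = ["the", "cat", "sat"]               # hypothetical tiny vocabulary
print(greedy_pick([1.0, 3.0, 0.5], vocab))  # prints "cat"
```

Because the choice is deterministic, running the same step on the same logits always returns the same token, which is exactly why repetition creeps in.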
- Beam Search
The Beam Search strategy keeps track of the k most probable sentences and picks the best one as the result.
If the beam size is set to 1, it reduces to Greedy; in that sense, beam search partly solves the duplication problem of Greedy.
However, if the beam size is too large, the result becomes too generic and less relevant, even though it is safe and "correct".
For example:
```
input:
I love to listen to Taylor Swift's songs, so I decided to attend Taylor's concert.
output:
What do you like to listen to?
```
```
beam size = 5
Beam Search Result (f1-score): rouge-1: 7.4, rouge-2: 1.9, rouge-L: 6.9
```
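The k-best bookkeeping can be sketched as below; `expand` is a hypothetical toy model whose probabilities are chosen so that greedy decoding (beam size 1) misses the higher-probability sequence that a beam of size 2 finds:

```python
import math

def expand(prefix):
    # Hypothetical next-token log-probabilities given the prefix so far.
    table = {
        (): {"A": math.log(0.6), "B": math.log(0.4)},
        ("A",): {"x": math.log(0.5), "y": math.log(0.5)},
        ("B",): {"x": math.log(0.9), "y": math.log(0.1)},
    }
    return table[prefix]

def beam_search(expand_fn, steps, beam_size):
    # Keep the beam_size highest-scoring partial sequences at every step.
    beams = [((), 0.0)]  # (token tuple, cumulative log-probability)
    for _ in range(steps):
        candidates = []
        for prefix, score in beams:
            for token, logprob in expand_fn(prefix).items():
                candidates.append((prefix + (token,), score + logprob))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    return beams[0][0]

print(beam_search(expand, 2, 1))  # beam size 1 commits to "A" first, like greedy
print(beam_search(expand, 2, 2))  # a wider beam finds ("B", "x"), prob 0.36 > 0.30
```

With beam size 1 the search greedily commits to "A" (0.6) and ends with total probability 0.30; keeping two beams lets it recover the globally better "B", "x" sequence (0.4 × 0.9 = 0.36).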
- Top k Sampling
Sampling is a strategy that randomly chooses the next word according to the probability distribution instead of taking the argmax.
Top k Sampling samples from that distribution, but restricted to the k most probable words.
However, when a rarely used word happens to be sampled, the sentence becomes less fluent.
```
k = 5
Top k Result (f1-score): rouge-1: 4.0, rouge-2: 0.5, rouge-L: 3.7
```
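A minimal sketch of the top-k restriction, using a made-up vocabulary and logits and the standard-library `random.choices` for the weighted draw:

```python
import math
import random

def top_k_sample(logits, vocab, k, rng=random):
    # Keep only the k highest-logit tokens, then sample among them
    # with softmax weights (unnormalized weights are fine for choices).
    pairs = sorted(zip(vocab, logits), key=lambda p: p[1], reverse=True)[:k]
    m = max(lp for _, lp in pairs)
    weights = [math.exp(lp - m) for _, lp in pairs]
    tokens = [t for t, _ in pairs]
    return rng.choices(tokens, weights=weights, k=1)[0]

vocab = ["the", "cat", "sat", "zebra"]   # hypothetical vocabulary
logits = [2.0, 1.5, 1.0, -3.0]           # made-up scores
print(top_k_sample(logits, vocab, k=3))  # "zebra" can never be sampled
```

With k equal to 1 this reduces to greedy; larger k trades fluency for diversity, which matches the drop in ROUGE reported above.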
- Nucleus (Top p) Sampling
Nucleus Sampling samples from the smallest subset of the vocabulary that holds most of the probability mass.
Compared with a fixed top-k, this subset can dynamically shrink and expand.
```
p = 5
Top p Result (f1-score): rouge-1: 3.0, rouge-2: 0.2, rouge-L: 2.9
```
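A sketch of that dynamic cutoff with a made-up distribution; note that p is a cumulative probability mass, so it is normally chosen in (0, 1] (0.8 below is an arbitrary example value):

```python
import random

def top_p_sample(probs, vocab, p, rng=random):
    # Sort by probability and keep the smallest prefix whose mass reaches p.
    pairs = sorted(zip(vocab, probs), key=lambda q: q[1], reverse=True)
    nucleus, mass = [], 0.0
    for token, prob in pairs:
        nucleus.append((token, prob))
        mass += prob
        if mass >= p:
            break
    tokens = [t for t, _ in nucleus]
    weights = [w for _, w in nucleus]
    return rng.choices(tokens, weights=weights, k=1)[0]

probs = [0.5, 0.3, 0.15, 0.05]            # made-up distribution
vocab = ["the", "cat", "sat", "zebra"]
print(top_p_sample(probs, vocab, p=0.8))  # nucleus is {"the", "cat"}
```

When the model is confident, the nucleus shrinks to one or two tokens; when it is uncertain, the nucleus grows, which is the "dynamic top-k" behaviour described above.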
- Temperature
Softmax temperature applies a temperature hyperparameter inside the softmax.
With a high temperature, the distribution becomes more uniform, giving more diversity.
With a low temperature, the distribution becomes more spiky, giving less diversity.
```
temperature = 5
Temperature Result (f1-score): rouge-1: 2.1, rouge-2: 0.04, rouge-L: 1.9
```
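The temperature-scaled softmax can be sketched as follows (the logits are made-up values):

```python
import math

def softmax_with_temperature(logits, temperature):
    # Divide the logits by the temperature before the usual softmax.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]                      # made-up scores
print(softmax_with_temperature(logits, 5.0))  # flatter: closer to uniform
print(softmax_with_temperature(logits, 0.5))  # spikier: mass on the argmax
```

As temperature goes to 0 the distribution approaches one-hot (greedy); as it grows the distribution approaches uniform, which explains the very low ROUGE at temperature = 5.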

As a result, we can see that beam search outperforms the other strategies in this task.