---
license: mit
language:
- ko
- en
- zh
base_model:
- Qwen/Qwen3-1.7B
pipeline_tag: summarization
tags:
- qwen3
- korean
- summary
- summarization
- ko
---

# qwen3-1.7B-ko-summary-finetuned-06-12

A fine-tuned Qwen3-1.7B model specialized for abstractive summarization of Korean documents, particularly academic papers. This model was trained on high-quality Korean paper summarization data and enhanced with emotional multi-turn conversation data to expand vocabulary and improve generation quality.

## Model Description

- **Architecture**: Qwen3-1.7B
- **Fine-tuning Task**: Abstractive summarization
- **Training Data**: Korean academic paper summaries (e.g., KoreaScience dataset) + emotional multi-turn conversation data

## Key Improvements

1. **Resolved Token Repetition Issue**: Fixed the meaningless token repetition seen in the previous colli98/qwen3-1.7B-ko-summary-finetuned model (an optional decode-time safeguard is sketched below)
2. **Structured Summary Format**: Addressed the previous model's unstructured summary format, producing more coherent output
3. **Enhanced Vocabulary**: Added emotional multi-turn conversation training data to expand vocabulary range beyond academic papers

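The repetition fix above came from fine-tuning, but if residual repetition still shows up on out-of-domain inputs it can also be suppressed at decode time with standard `transformers` generation options. This is an optional, minimal sketch; the repository id and the parameter values are illustrative assumptions, not settings used during training.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "colli98/qwen3-1.7B-ko-summary-finetuned-06-12"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("요약할 긴 한국어 문서...", return_tensors="pt", truncation=True)

# Decode-time guards against repeated tokens; the values are illustrative, not tuned.
summary_ids = model.generate(
    **inputs,
    max_new_tokens=150,
    repetition_penalty=1.1,   # >1.0 penalizes tokens that have already been generated
    no_repeat_ngram_size=3,   # forbid repeating any exact 3-gram in the output
)
print(tokenizer.decode(summary_ids[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```
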
## Intended Use

- Summarizing long Korean documents, especially research papers, into clear, concise overviews.
- Integrating into research tools, educational platforms, or automated document-processing pipelines.

## Performance Evaluation

### ROUGE Score Comparison

| Metric                   | Previous Model | Current Model | Improvement |
| ------------------------ | -------------- | ------------- | ----------- |
| **ROUGE-1 Precision**    | 0.357          | 0.388         | **+8.7%**   |
| **ROUGE-1 Recall**       | 0.189          | 0.174         | -7.9%       |
| **ROUGE-1 F-measure**    | 0.247          | 0.241         | -2.4%       |
| **ROUGE-2 Precision**    | 0.109          | 0.169         | **+55.0%**  |
| **ROUGE-2 Recall**       | 0.058          | 0.076         | **+31.1%**  |
| **ROUGE-2 F-measure**    | 0.075          | 0.104         | **+38.7%**  |
| **ROUGE-L Precision**    | 0.269          | 0.328         | **+21.9%**  |
| **ROUGE-L Recall**       | 0.142          | 0.147         | **+3.5%**   |
| **ROUGE-L F-measure**    | 0.186          | 0.203         | **+9.1%**   |
| **ROUGE-Lsum Precision** | 0.316          | 0.319         | **+0.9%**   |
| **ROUGE-Lsum Recall**    | 0.168          | 0.171         | **+1.8%**   |
| **ROUGE-Lsum F-measure** | 0.219          | 0.223         | **+1.8%**   |

### Performance Analysis

**Positive Improvements:**

- **Overall Precision Enhancement**: Precision improved on every ROUGE metric, indicating higher-quality generated content
- **Significant ROUGE-2 Improvement**: The large gains on bigram-level metrics suggest more natural and coherent sentence structure in the generated summaries

**Trade-offs:**

- **Partial Recall Decrease**: Recall dropped slightly, particularly for ROUGE-1, suggesting that some content from the reference texts may be missed
- **Room for Further Improvement**: All metrics remain below 0.4, so there is still room for additional performance gains

**Conclusion**: Fine-tuning improved **generation quality (precision)** while showing a slight trade-off in **completeness (recall)**. The significant ROUGE-2 improvement represents meaningful progress in model performance.

![ROUGE Score Comparison](rouge_comparison_chart.png)

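The evaluation script is not part of this repository; the sketch below shows one way such per-metric precision/recall/F-measure numbers can be computed with the `rouge_score` package (`pip install rouge-score`). The example sentences and the whitespace tokenizer for Korean are assumptions, not the exact setup used for the table above.

```python
from rouge_score import rouge_scorer

class WhitespaceTokenizer:
    """Whitespace tokenizer so Korean text is not dropped by the default
    English-oriented tokenizer in rouge_score."""
    def tokenize(self, text):
        return text.split()

# Illustrative reference summary and model output.
reference = "이 논문은 한국어 문서 요약 모델의 성능을 분석한다."
prediction = "이 논문은 한국어 요약 모델의 성능을 평가하고 분석한다."

scorer = rouge_scorer.RougeScorer(
    ["rouge1", "rouge2", "rougeL", "rougeLsum"],
    use_stemmer=False,
    tokenizer=WhitespaceTokenizer(),
)
scores = scorer.score(reference, prediction)
for name, s in scores.items():
    print(f"{name}: P={s.precision:.3f} R={s.recall:.3f} F={s.fmeasure:.3f}")

# The "Improvement" column in the table is the relative change between models,
# e.g. ROUGE-2 precision: (0.169 - 0.109) / 0.109 ≈ +55.0%.
print(f"relative improvement: {(0.169 - 0.109) / 0.109:+.1%}")
```
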
## Limitations & Risks

- May produce inaccuracies or hallucinated content.
- Not intended for generating verbatim legal/medical texts or for extractive quotation.
- Users should verify critical facts against original sources.

## Installation

```bash
pip install torch transformers safetensors
```

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Qwen3 is a decoder-only model, so it is loaded with AutoModelForCausalLM
# rather than AutoModelForSeq2SeqLM.
model_id = "colli98/qwen3-1.7B-ko-summary-finetuned-06-12"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

text = "여기에 긴 한국어 논문 텍스트를 입력하세요..."  # paste the long Korean paper text here
inputs = tokenizer(text, return_tensors="pt", truncation=True)
summary_ids = model.generate(
    **inputs,
    max_new_tokens=150,   # length budget for the summary itself, not input + output
    num_beams=4,
    early_stopping=True,
)
# Decode only the newly generated tokens, skipping the echoed prompt.
summary = tokenizer.decode(
    summary_ids[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
)
print(summary)
```

## Files in This Repository

```bash
.
├── config.json
├── generation_config.json
├── model.safetensors
├── model.safetensors.index.json
├── tokenizer.json
├── tokenizer_config.json
├── special_tokens_map.json
├── vocab.json
├── merges.txt
└── added_tokens.json
```
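
To pull these files directly (for example to inspect the tokenizer files) instead of going through `from_pretrained`, one option is `huggingface_hub`; a minimal sketch, assuming the repository id `colli98/qwen3-1.7B-ko-summary-finetuned-06-12`:

```python
from huggingface_hub import snapshot_download

# Downloads every file listed above into the local Hugging Face cache
# and returns the path of the downloaded snapshot.
local_dir = snapshot_download(repo_id="colli98/qwen3-1.7B-ko-summary-finetuned-06-12")
print(local_dir)
```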