---
license: gpl
model_name: GPT2
model_type: GPT2
language: en
pipeline_tag: text-generation
tags:
- pytorch
- gpt
- gpt2
---

# Fine-tuning GPT2 with an energy plus medical dataset

Fine-tuning pre-trained language models for text generation.

This model is pretrained on Chinese text with a GPT-2 language modeling (LM head) objective.

## Model description

Transfer learning from DavidLanz/uuu_fine_tune_taipower, further fine-tuned on a medical dataset with the GPT-2 architecture.
|
23 |
+
|
24 |
+
### How to use
|
25 |
+
|
26 |
+
You can use this model directly with a pipeline for text generation. Since the generation relies on some randomness, we
|
27 |
+
set a seed for reproducibility:
```python
>>> from transformers import GPT2LMHeadModel, BertTokenizer, TextGenerationPipeline, set_seed

>>> set_seed(42)

>>> model_path = "DavidLanz/uuu_fine_tune_gpt2"
>>> model = GPT2LMHeadModel.from_pretrained(model_path)
>>> tokenizer = BertTokenizer.from_pretrained(model_path)

>>> max_length = 200
>>> prompt = "歐洲能源政策"  # "European energy policy"
>>> text_generator = TextGenerationPipeline(model, tokenizer)
>>> text_generated = text_generator(prompt, max_length=max_length, do_sample=True)
>>> print(text_generated[0]["generated_text"].replace(" ", ""))
```
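Reproducibility with `do_sample=True` depends on seeding every random number generator involved; `transformers.set_seed` seeds Python's `random`, NumPy, and PyTorch in one call. The principle can be sketched with the standard library alone (no model download needed):

```python
import random

# Same seed → same sequence of draws. set_seed applies this idea to all
# RNGs that sampling-based generation touches.
random.seed(42)
first_run = [random.random() for _ in range(3)]

random.seed(42)
second_run = [random.random() for _ in range(3)]

print(first_run == second_run)
```

Without re-seeding between runs, the two lists would differ, and repeated generations from the same prompt would produce different text.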

The same pipeline can be used with a medical prompt:

```python
>>> from transformers import GPT2LMHeadModel, BertTokenizer, TextGenerationPipeline, set_seed

>>> set_seed(42)

>>> model_path = "DavidLanz/uuu_fine_tune_gpt2"
>>> model = GPT2LMHeadModel.from_pretrained(model_path)
>>> tokenizer = BertTokenizer.from_pretrained(model_path)

>>> max_length = 200
>>> prompt = "蕁麻疹過敏"  # "hives (urticaria) allergy"
>>> text_generator = TextGenerationPipeline(model, tokenizer)
>>> text_generated = text_generator(prompt, max_length=max_length, do_sample=True)
>>> print(text_generated[0]["generated_text"].replace(" ", ""))
```
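A note on the final `replace(" ", "")` in both examples: `BertTokenizer` decodes Chinese output with spaces between tokens, so the generated text is cleaned by stripping them. A minimal sketch of that post-processing step, using a hypothetical decoded string rather than a real model call:

```python
# Hypothetical decoded output: BERT-style tokenizers join tokens with
# spaces, which is unwanted when rendering Chinese text.
decoded = "歐 洲 能 源 政 策"
cleaned = decoded.replace(" ", "")
print(cleaned)  # 歐洲能源政策
```

Note that this also removes any legitimate spaces in mixed Chinese/English output, which is an acceptable trade-off for mostly-Chinese generations.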