---
license: apache-2.0
datasets:
- BAAI/COIG-PC
language:
- zh
library_name: transformers
pipeline_tag: question-answering
---

# Model Card for Model ID

<!-- Provide a quick summary of what the model is/does. -->

This is an experimental model that can be used to create new LLMs based on the Chinese language. It was generated using [Chinese-LLaMA-Alpaca](https://github.com/ymcui/Chinese-LLaMA-Alpaca).

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

- **Developed by:** yjf9966
- **Model type:** LLaMA with an enhanced tokenizer (vocabulary size 49954)
- **Language(s) (NLP):** Chinese
- **License:** Apache 2.0
- **Finetuned from model:** [Chinese-LLaMA-Alpaca](https://github.com/ymcui/Chinese-LLaMA-Alpaca)

### Model Sources [optional]

<!-- Provide the basic links for the model. -->

- **Repository:** https://huggingface.co/BlueWhaleX/Chinese-Alpaca-COIG-49954-13B-HF

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

You can use the model as-is for Chinese instruction following and question answering, but it is mostly intended to be fine-tuned on a downstream task.
Note that this is a causal language model: it generates text autoregressively, so it is best suited to generation-style tasks such as question answering, summarization, and dialogue.


## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

Even though the training data could be characterized as fairly neutral, the model can still produce biased predictions.
It also inherits some of the biases of its training dataset.

### Recommendations

<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
import torch
from transformers import LlamaTokenizer, LlamaForCausalLM

def generate_prompt(text):
    # Standard Alpaca-style instruction template.
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n\n{text}\n\n### Response:\n\n"
    )

tokenizer = LlamaTokenizer.from_pretrained('BlueWhaleX/Chinese-Alpaca-COIG-49954-13B-HF')
model = LlamaForCausalLM.from_pretrained('BlueWhaleX/Chinese-Alpaca-COIG-49954-13B-HF').half().cuda()
model.eval()

text = '王国维说:“自周之衰,文王、周公势力之瓦解也,国民之智力成熟于内,政治之纷乱乘之于外,上无统一之制度,下迫于社会之要求,于是诸于九流各创其学说。” 他意在说明 A. 分封制的崩溃 B. 商鞅变法的作用 C. 兼并战争的后果 D. 百家争鸣的原因'
prompt = generate_prompt(text)
input_ids = tokenizer.encode(prompt, return_tensors='pt').to('cuda')

with torch.no_grad():
    output_ids = model.generate(
        input_ids=input_ids,
        max_new_tokens=400,
        temperature=0.2,
        top_k=40,
        top_p=0.9,
        repetition_penalty=1.3,
    )

output = tokenizer.decode(output_ids[0], skip_special_tokens=True)
response = output.split("### Response:")[1].strip()
print("Response: ", response, '\n')
```
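
The `output.split("### Response:")` step in the script above assumes the marker is always present in the decoded text. A small helper (hypothetical, not part of the published script) makes that parsing step reusable and tolerant of a missing marker:

```python
def extract_response(output: str) -> str:
    """Return the model's answer from the decoded output.

    The Alpaca-style prompt ends with "### Response:", so everything after
    the first occurrence of that marker is the generated answer. If the
    marker is missing, fall back to the whole decoded string.
    """
    marker = "### Response:"
    if marker in output:
        return output.split(marker, 1)[1].strip()
    return output.strip()
```

For example, `extract_response("### Instruction:\n\nq\n\n### Response:\n\n答案")` returns `"答案"`.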


## Training Details

### Training Data

<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

[BAAI/COIG-PC](https://huggingface.co/datasets/BAAI/COIG-PC)

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

#### Preprocessing [optional]

The dataset was split into 80% for training and 20% for testing.
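
The exact preprocessing code is not published; assuming a simple seeded shuffle, the 80/20 split might look like the sketch below (the function name and seed are illustrative):

```python
import random

def split_dataset(examples, test_fraction=0.2, seed=42):
    """Shuffle examples with a fixed seed and split them into train/test lists."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]   # (train, test)

train, test = split_dataset(range(100))   # 80 train examples, 20 test examples
```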


#### Training Hyperparameters

- **Training regime:** fp16 mixed precision, lr=1e-4, lora_rank=8, lora_alpha=32 <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
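
In `peft` terms, the LoRA hyperparameters above would correspond to a config roughly like the following sketch. The `target_modules` and `lora_dropout` values are assumptions (a common choice for LLaMA, not stated in this card); the learning rate and fp16 setting go to the trainer, not to `LoraConfig`:

```python
from peft import LoraConfig

# Sketch of a LoRA config matching the hyperparameters listed above.
lora_config = LoraConfig(
    r=8,                                   # lora_rank from the card
    lora_alpha=32,                         # lora_alpha from the card
    target_modules=["q_proj", "v_proj"],   # assumption, not stated in the card
    lora_dropout=0.05,                     # assumption, not stated in the card
    bias="none",
    task_type="CAUSAL_LM",
)
```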


## Evaluation

### Testing Data

<!-- This should link to a Data Card if possible. -->
20% of the BAAI/COIG-PC dataset.

## Citation

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
```bibtex
@software{Chinese-Alpaca-COIG-49954-13B-HF,
  author = {yjf9966},
  title = {An Enhanced Chinese Language Model Based on Chinese-Alpaca},
  url = {https://huggingface.co/BlueWhaleX/Chinese-Alpaca-COIG-49954-13B-HF},
  year = {2023}
}
```