Update README
README.md
@@ -5,21 +5,26 @@ tags:
 - llama
 ---
 
-#
+# OpenChat: Less is More for Open-source Models
 
-OpenChat is a series of open-source language models fine-tuned on
+OpenChat is a series of open-source language models fine-tuned on very little diverse and high-quality multi-round conversations. The [dataset](https://huggingface.co/datasets/openchat/openchat_sharegpt4_dataset) contains only ~6K GPT-4 conversations filtered from the 90K ShareGPT conversations.
 
 Generic models:
 
-- OpenChat: based on LLaMA-13B (
-
+- OpenChat: based on LLaMA-13B (2048 context length)
+  - **105.7%** of ChatGPT score on Vicuna GPT-4 evaluation
+  - **80.87%** Win-rate on AlpacaEval
+  - **🚀 Only used 6K data for finetuning!!!**
+- OpenChat-8192: based on LLaMA-13B (extended to 8192 context length)
+  - **106.6%** of ChatGPT score on Vicuna GPT-4 evaluation
 
-Code models
+Code models:
 
--
-
-
+- OpenCoderPlus: based on StarCoderPlus (native 8192 context length)
+  - **102.5%** of ChatGPT score on Vicuna GPT-4 evaluation
+  - **78.70%** Win-rate on AlpacaEval
 
+**NOTE:** Please load the pretrained models using *bfloat16*
 
 ## Conversation Template
 
@@ -34,15 +39,13 @@ Besides base model vocabulary, an end-of-turn token `<|end_of_turn|>` is added,
 tokenize("User:") + tokenize(user_question) + [eot_token_id] + tokenize("Assistant:")
 ```
 
-*Hint: In BPE, `tokenize(A) + tokenize(B)` not always equals to `tokenize(A + B)`*
+*Hint: In BPE, `tokenize(A) + tokenize(B)` does not always equals to `tokenize(A + B)`*
 
 Following is the code for generating the conversation templates:
 
 ```python
 @dataclass
 class ModelConfig:
-    name: str
-
     # Prompt
     system: Optional[str]
 
@@ -51,9 +54,6 @@ class ModelConfig:
     eot_token: str
     bos_token: Optional[str] = None
 
-    # Tokenize
-    max_tokens: Optional[int] = None
-
     # Get template
     def generate_conversation_template(self, tokenize_fn, tokenize_special_fn, message_list):
         tokens = []
@@ -86,19 +86,12 @@ class ModelConfig:
             else:
                 assert idx == len(message_list) - 1, "Empty message for completion must be on the last."
 
-        # Truncate to specified tokens
-        if self.max_tokens:
-            tokens = tokens[:self.max_tokens]
-            masks = masks[:self.max_tokens]
-
         return tokens, masks
 
 
 MODEL_CONFIG_MAP = {
-    # OpenChat
+    # OpenChat / OpenChat-8192
     "openchat": ModelConfig(
-        name="OpenChat",
-
         # Prompt
         system=None,
 
@@ -109,15 +102,10 @@ MODEL_CONFIG_MAP = {
         ai_role="gpt",
         eot_token="<|end_of_turn|>",
         bos_token="<s>",
-
-        # Tokenize
-        max_tokens=2048
     ),
 
     # OpenCoder / OpenCoderPlus
     "opencoder": ModelConfig(
-        name="OpenCoder",
-
         # Prompt
         system=None,
 
@@ -128,9 +116,6 @@ MODEL_CONFIG_MAP = {
         ai_role="gpt",
         eot_token="<|end_of_turn|>",
         bos_token=None,
-
-        # Tokenize
-        max_tokens=8192
     )
 }
 ```
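
Reviewer note: the BPE hint and the template line `tokenize("User:") + tokenize(user_question) + [eot_token_id] + tokenize("Assistant:")` in this diff can be illustrated with a toy tokenizer. This is a minimal sketch under stated assumptions, not the project's code: `VOCAB`, `EOT_TOKEN_ID`, `tokenize`, and `build_prompt` are hypothetical names, and the greedy longest-match tokenizer only stands in for a real BPE tokenizer.

```python
# Toy vocabulary; a hypothetical stand-in for a real BPE vocabulary.
VOCAB = {"User": 0, ":": 1, "User:": 2, "Assistant": 3, "Assistant:": 4, "Hi": 5, " ": 6}
EOT_TOKEN_ID = 100  # hypothetical id for the <|end_of_turn|> token


def tokenize(text):
    """Greedy longest-match tokenizer over VOCAB (illustrative, not real BPE)."""
    ids = []
    pos = 0
    while pos < len(text):
        # Try the longest remaining substring first, then shorter ones.
        for end in range(len(text), pos, -1):
            piece = text[pos:end]
            if piece in VOCAB:
                ids.append(VOCAB[piece])
                pos = end
                break
        else:
            raise ValueError(f"cannot tokenize {text[pos:]!r}")
    return ids


def build_prompt(user_question):
    """Concatenate token lists piece by piece, following the template line."""
    return (
        tokenize("User:")
        + tokenize(user_question)
        + [EOT_TOKEN_ID]
        + tokenize("Assistant:")
    )


# Tokenizing pieces separately can differ from tokenizing the joined string.
separate = tokenize("User") + tokenize(":")
joined = tokenize("User" + ":")
prompt = build_prompt("Hi")
```

Here `separate` and `joined` come out different because the joined string matches a single merged vocabulary entry. That is why the template concatenates token lists rather than tokenizing one concatenated string.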