Text Generation
Transformers
PyTorch
Safetensors
llama
text-generation-inference
mfromm committed · verified
Commit b8a7fec · 1 Parent(s): 9bc452a

Update README.md

Files changed (1): README.md (+6 -6)
README.md CHANGED
@@ -31,9 +31,9 @@ pipeline_tag: text-generation
 library_name: transformers
 license: apache-2.0
 ---
-# Model Card for Teuken-7B-base-v0.6
+# Model Card for Teuken 7B-base-v0.6
 
-Teuken-7B-base-v0.6 is a 7B parameter multilingual large language model (LLM) pre-trained on 6T tokens within the research project OpenGPT-X.
+Teuken 7B-base-v0.6 is a 7B parameter multilingual large language model (LLM) pre-trained on 6T tokens within the research project OpenGPT-X.
 
 
 ### Model Description
@@ -49,7 +49,7 @@ Teuken-7B-base-v0.6 is a 7B parameter multilingual large language model (LLM) pr
 ## Uses
 
 <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-Teuken-7B-base-v0.6 is intended for commercial and research use in all 24 official European languages. Since Teuken-7B-base-v0.6
+Teuken 7B-base-v0.6 is intended for commercial and research use in all 24 official European languages. Since Teuken 7B-base-v0.6
 focuses on covering all 24 EU languages, it produces more stable results across these languages and better reflects European values in its answers than English-centric models. It is therefore specialized for use in multilingual tasks.
 
 ### Out-of-Scope Use
@@ -62,7 +62,7 @@ The model is not intended for use in math and coding tasks.
 
 <!-- This section is meant to convey both technical and sociotechnical limitations. -->
 
-As a base model, Teuken-7B-base-v0.6 is not free from biases and hallucinations. It is therefore recommended to instruction-tune it to fit the user's purposes and to minimize biases and any arising risks. Fine-tuned models that limit risks and biases are expected to appear soon after the release of the base model as a community effort.
+As a base model, Teuken 7B-base-v0.6 is not free from biases and hallucinations. It is therefore recommended to instruction-tune it to fit the user's purposes and to minimize biases and any arising risks. Fine-tuned models that limit risks and biases are expected to appear soon after the release of the base model as a community effort.
 
 ## How to Get Started with the Model
 
@@ -74,7 +74,7 @@ After installation, here's an example of how to use the model:
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer
 
-model_name = "openGPT-X/Teuken-7B-base-v0.6"
+model_name = "openGPT-X/Teuken 7B-base-v0.6"
 prompt = "Insert text here..."
 device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
 tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
@@ -93,7 +93,7 @@ This example demonstrates how to load the model and tokenizer, prepare input, ge
 
 <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
 
-Teuken-7B-base-v0.6 was pre-trained on 5.5 trillion tokens of data from publicly available sources.
+Teuken 7B-base-v0.6 was pre-trained on 5.5 trillion tokens of data from publicly available sources.
 
 The pretraining data has a cutoff of September 2023.
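
The quickstart snippet is truncated in the diff above (the hunk ends at the tokenizer call). For reference, a minimal self-contained sketch of the intended usage follows. It assumes the hyphenated Hub id `openGPT-X/Teuken-7B-base-v0.6` from the left side of the diff, since Hub repository ids cannot contain spaces and the renamed "Teuken 7B" form applies to the display name; the dtype and generation parameters are illustrative choices, not the model card's.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hyphenated repository id (Hub ids cannot contain spaces; the renamed
# "Teuken 7B" form from this commit is a display name).
model_name = "openGPT-X/Teuken-7B-base-v0.6"
prompt = "Insert text here..."

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# trust_remote_code=True is carried over from the snippet in the diff;
# the repository ships custom modeling code.
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,  # assumption: bf16 to halve memory vs. fp32
).to(device)
model.eval()

# Tokenize, generate, and decode; max_new_tokens is an illustrative choice.
inputs = tokenizer(prompt, return_tensors="pt").to(device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```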
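
The limitations section recommends instruction-tuning the base model before deployment. As a rough illustration (not part of this commit), here is a minimal supervised fine-tuning sketch using the plain `transformers` Trainer; the `instructions.jsonl` file, its `instruction`/`response` fields, the prompt template, and all hyperparameters are hypothetical.

```python
# Minimal sketch: supervised instruction tuning with the transformers Trainer.
# Dataset file, column names, prompt template, and hyperparameters are hypothetical.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "openGPT-X/Teuken-7B-base-v0.6"  # hyphenated Hub id, as above
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

# Causal-LM collators need a pad token; fall back to EOS if none is defined.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Hypothetical instruction data: one JSON object per line with
# "instruction" and "response" fields.
dataset = load_dataset("json", data_files="instructions.jsonl")["train"]

def format_and_tokenize(example):
    text = (
        f"### Instruction:\n{example['instruction']}\n"
        f"### Response:\n{example['response']}{tokenizer.eos_token}"
    )
    return tokenizer(text, truncation=True, max_length=1024)

tokenized = dataset.map(format_and_tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="teuken-7b-sft",  # hypothetical output path
        per_device_train_batch_size=1,
        gradient_accumulation_steps=16,
        num_train_epochs=1,
        bf16=True,
        logging_steps=10,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```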