Delete README.md
A small LLM for generating unfunny jokes (for now). Trained on the [RussianJokes](https://huggingface.co/datasets/IgorVolochay/russian_jokes) dataset. Built as part of a VK Education course project.
# Architecture:
10.55M parameters, SwiGLU, GQA, ALiBi, byte-level BPE
- n_layer=6
- n_head=6
- n_kv_head=3
- hidden_dim=384
- intermediate_dim=1024
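
The hyperparameters above imply a few derived quantities. The sketch below collects them in a hypothetical `ModelConfig` dataclass (this name and its helper methods are illustrative and not taken from the project's code) and estimates the weight count of the transformer blocks. It assumes bias-free linear layers, a three-matrix SwiGLU feed-forward block, and K/V projections sized by `n_kv_head`; normalization and embedding weights are left out because the card does not list the vocabulary size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    # Values copied from the architecture list in this card.
    n_layer: int = 6
    n_head: int = 6
    n_kv_head: int = 3
    hidden_dim: int = 384
    intermediate_dim: int = 1024

    @property
    def head_dim(self) -> int:
        # Width of a single attention head.
        return self.hidden_dim // self.n_head

    @property
    def queries_per_kv_head(self) -> int:
        # GQA: how many query heads share one key/value head.
        return self.n_head // self.n_kv_head

    def block_weight_count(self) -> int:
        # Assumption: no biases; norm and embedding weights are ignored.
        kv_dim = self.n_kv_head * self.head_dim
        # Q and output projections are hidden x hidden, K and V are hidden x kv_dim.
        attention = 2 * self.hidden_dim * self.hidden_dim + 2 * self.hidden_dim * kv_dim
        # SwiGLU uses gate, up and down projections.
        swiglu = 3 * self.hidden_dim * self.intermediate_dim
        return self.n_layer * (attention + swiglu)


config = ModelConfig()
print(config.head_dim, config.queries_per_kv_head, config.block_weight_count())
# prints: 64 2 9732096
```

Under these assumptions the blocks account for roughly 9.73M of the stated 10.55M parameters; the remainder would come from embeddings and normalization layers.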
# Usage
```
import torch

# ByteLevelBPETokenizer and TransformerForCausalLM are assumed to be defined
# by the project's own code; REPO_NAME is the Hub repository id of this model.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tokenizer = ByteLevelBPETokenizer.from_pretrained(REPO_NAME)
check_model = TransformerForCausalLM.from_pretrained(REPO_NAME).to(device).eval()

text = "Штирлиц пришел домой"  # prompt: "Stierlitz came home"
input_ids = torch.tensor(tokenizer.encode(text), device=device)
model_output = check_model.generate(
    input_ids[None, :],  # add a batch dimension
    max_new_tokens=200,
    eos_token_id=tokenizer.eos_token_id,
    do_sample=True,
    top_k=10,
)
print(tokenizer.decode(model_output[0].tolist()))
```
Output:
```
Штирлиц пришел домой к врачу и видит, что он пришел с ней.
```
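
The `do_sample=True, top_k=10` arguments in the usage snippet ask for top-k sampling: at each step only the `k` most likely tokens are kept and the next token is drawn from them in proportion to their softmax probabilities. The following is a minimal pure-Python sketch of that idea on toy logits. The function `sample_top_k` is hypothetical and is not the model's actual `generate` implementation, which is not shown in this card.

```python
import math
import random


def sample_top_k(logits, k, rng=random):
    """Sample an index from the k highest logits, weighted by softmax."""
    k = min(k, len(logits))
    # Indices of the k largest logits, highest first.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Subtract the maximum before exponentiating for numerical stability.
    max_logit = logits[top[0]]
    weights = [math.exp(logits[i] - max_logit) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]


toy_logits = [2.0, -1.0, 0.5, 3.0, 0.0]
rng = random.Random(0)
picked = sample_top_k(toy_logits, k=2, rng=rng)
print(picked)  # always index 0 or 3, the two largest logits
```

With `k=1` this reduces to greedy decoding. Larger `k` gives more varied output at the cost of more unlikely tokens.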
@@ -1,15 +0,0 @@
----
-tags:
-- model_hub_mixin
-- pytorch_model_hub_mixin
-license: mit
-datasets:
-- IgorVolochay/russian_jokes
-language:
-- ru
-pipeline_tag: text-generation
----
-
-This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
-- Library: [More Information Needed]
-- Docs: [More Information Needed]