YugoGPT-Florida

A Serbian large language model based on the Mistral architecture.

  • Developed by: datatab
  • License: MIT
  • Model size: 7.24B parameters (Safetensors, FP16)

๐Ÿ† Results

Results obtained on the Serbian LLM Evaluation Benchmark. The SCORE column is the mean of the eight per-task accuracies, expressed as a percentage (a quick check of this is sketched below the table).

| MODEL | ARC-E | ARC-C | Hellaswag | PiQA | Winogrande | BoolQ | OpenbookQA | OZ_EVAL | SCORE |
|---|---|---|---|---|---|---|---|---|---|
| YugoGPT-Florida | 0.6918 | 0.5766 | 0.4037 | 0.7374 | 0.5782 | 0.8685 | 0.5918 | 0.7407 | 64.85875 |
| Yugo55A-GPT | 0.5846 | 0.5185 | 0.3686 | 0.7076 | 0.5277 | 0.8584 | 0.5485 | 0.6883 | 60.0275 |
| Yugo60-GPT | 0.4948 | 0.4542 | 0.3342 | 0.6897 | 0.5138 | 0.8212 | 0.5155 | 0.6379 | 55.76625 |
| Yugo45-GPT | 0.4049 | 0.3900 | 0.2812 | 0.6055 | 0.4992 | 0.5793 | 0.4433 | 0.6111 | 47.68125 |
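A minimal sanity check of the SCORE computation, in plain Python, using the accuracies from the YugoGPT-Florida row above:

# Reproduce the SCORE column: mean of the eight task accuracies, as a percentage.
accuracies = {
    "ARC-E": 0.6918, "ARC-C": 0.5766, "Hellaswag": 0.4037, "PiQA": 0.7374,
    "Winogrande": 0.5782, "BoolQ": 0.8685, "OpenbookQA": 0.5918, "OZ_EVAL": 0.7407,
}
score = 100 * sum(accuracies.values()) / len(accuracies)
print(f"{score:.5f}")  # 64.85875, matching the SCORE column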


๐Ÿ‹๏ธ Training Stats

[Training charts]

💻 Usage

# Install transformers from source (a stable release via
# `pip install transformers` should also work).
!pip -q install git+https://github.com/huggingface/transformers

from IPython import get_ipython
from IPython.display import HTML, display

def set_css():
    # Wrap long output lines in notebook cells (a Colab/Jupyter display tweak).
    display(HTML('''
    <style>
      pre {
          white-space: pre-wrap;
      }
    </style>
    '''))

# Re-apply the CSS before every cell runs.
get_ipython().events.register('pre_run_cell', set_css)
import torch
from transformers import AutoTokenizer, MistralForCausalLM

# Use a GPU if one is available.
device = "cuda" if torch.cuda.is_available() else "cpu"

model = MistralForCausalLM.from_pretrained(
    "datatab/YugoGPT-Florida",
    torch_dtype="auto"
).to(device)

tokenizer = AutoTokenizer.from_pretrained("datatab/YugoGPT-Florida")
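The load above keeps the full-precision weights on a single device. On smaller GPUs, a quantized load is one alternative; a minimal sketch, in place of the load above, assuming the optional bitsandbytes and accelerate packages are installed (this is an addition, not part of the original card):

from transformers import BitsAndBytesConfig

# Hypothetical low-memory variant: load the weights quantized to 4-bit NF4.
quant_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4")
model = MistralForCausalLM.from_pretrained(
    "datatab/YugoGPT-Florida",
    quantization_config=quant_config,  # requires bitsandbytes
    device_map="auto",                 # requires accelerate; places weights automatically
)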
from typing import Optional

from transformers import TextStreamer


# Default Serbian system prompt. In English: "Below is an instruction that
# defines a task, together with input that provides additional context. Based
# on this information, write a response that precisely and accurately fulfills
# the request."
DEFAULT_SYSTEM_CONTENT = """Ispod se nalazi uputstvo koje definiše zadatak, zajedno sa unosom koji pruža dodatni kontekst.
Na osnovu ovih informacija, napišite odgovor koji precizno i tačno ispunjava zahtev.
"""


def generate(user_content: str, system_content: Optional[str] = None) -> str:
    # Fall back to the default system prompt if none is supplied.
    if not system_content:
        system_content = DEFAULT_SYSTEM_CONTENT

    messages = [
        {"role": "system", "content": system_content},
        {"role": "user", "content": user_content},
    ]

    # Render the chat template and move the token IDs to the model's device.
    tokenized_chat = tokenizer.apply_chat_template(
        messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
    ).to(device)

    # Stream tokens to stdout as they are produced.
    text_streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
    output = model.generate(
        tokenized_chat,
        streamer=text_streamer,
        max_new_tokens=2048,
        temperature=0.1,
        repetition_penalty=1.11,
        top_p=0.92,
        top_k=1000,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
        do_sample=True,
    )

    return tokenizer.decode(output[0], skip_special_tokens=True)
generate("Nabroj mi sve planete suncevog sistemai reci mi koja je najveca planeta?")
Sunฤev sistem sadrลพi osam planeta: Merkur, Venera, Zemlja, Mars, Jupiter, Saturn, Uran i Neptun. Najveฤ‡a planeta u Sunฤevom sistemu je Jupiter.
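One caveat: generate returns the decoded sequence including the rendered prompt. To get only the model's reply, the prompt tokens can be sliced off before decoding; a sketch of a replacement for the function's final line, reusing the tokenized_chat and output variables inside generate (an addition, not part of the original card):

    # Inside generate(), in place of the final decode: return only the newly
    # generated tokens, dropping the echoed prompt.
    prompt_length = tokenized_chat.shape[-1]  # token count of the templated prompt
    return tokenizer.decode(output[0][prompt_length:], skip_special_tokens=True)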

💡 Contributions Welcome!

Have ideas, bug fixes, or want to add a custom model? We'd love for you to be part of the journey! Contributions help grow and enhance the capabilities of YugoGPT-Florida.

📜 Citation

Thanks for using YugoGPT-Florida, where large language models meet Serbian precision and creativity! Let's build smarter models together. 🚀

If you find this model useful in your research, please cite it as follows:

@misc{YugoGPT-Florida,
  title={YugoGPT-Florida},
  author={datatab},
  year={2024},
  url={https://huggingface.co/datatab/YugoGPT-Florida}
}
<
Downloads last month
11
Safetensors
Model size
7.24B params
Tensor type
FP16
ยท
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Model tree for datatab/YugoGPT-Florida

Finetuned
(1)
this model

Datasets used to train datatab/YugoGPT-Florida

Collection including datatab/YugoGPT-Florida