|
--- |
|
license: gemma |
|
language: |
|
- bg |
|
base_model: |
|
- s-emanuilov/Tucan-27B-v1.0 |
|
tags: |
|
- function_calling |
|
- MCP |
|
- tool_use |
|
--- |
|
|
|
# Tucan-27B-v1.0-GGUF |
|
|
|
## Bulgarian Language Models for Function Calling 🇧🇬 |
|
|
|
> 📄 **Full methodology, dataset details, and evaluation results coming in the upcoming paper** |
|
|
|
## Overview 🚀 |
|
|
|
TUCAN (Tool-Using Capable Assistant Navigator) is a series of open-source Bulgarian language models fine-tuned specifically for function calling and tool use. |
|
|
|
These models can interact with external tools, APIs, and databases, making them well suited for building AI agents and [Model Context Protocol (MCP)](https://arxiv.org/abs/2503.23278) applications. |
|
|
|
Built on top of [BgGPT models](https://huggingface.co/collections/INSAIT-Institute/bggpt-gemma-2-673b972fe9902749ac90f6fe) from [INSAIT Institute](https://insait.ai/), these models have been enhanced with function-calling capabilities. |
|
|
|
## Motivation 🎯 |
|
|
|
Although BgGPT models demonstrate [strong Bulgarian language comprehension](https://arxiv.org/pdf/2412.10893), they struggle to maintain the precise output format that consistent function calling requires. Even with detailed system prompts, their performance on this task remains suboptimal. |
|
|
|
This project addresses that gap by fine-tuning BgGPT, providing the Bulgarian AI community with proper tool-use capabilities in their native language. |
|
|
|
## Models and variants 📦 |
|
Available in three sizes with full models, LoRA adapters, and quantized GGUF variants: |
|
|
|
| Model Size | Full Model | LoRA Adapter | GGUF (Quantized) |
|------------|------------|--------------|------------------|
| **2.6B** | [Tucan-2.6B-v1.0](https://huggingface.co/s-emanuilov/Tucan-2.6B-v1.0) | [LoRA](https://huggingface.co/s-emanuilov/Tucan-2.6B-v1.0-LoRA) | [GGUF](https://huggingface.co/s-emanuilov/Tucan-2.6B-v1.0-GGUF) |
| **9B** | [Tucan-9B-v1.0](https://huggingface.co/s-emanuilov/Tucan-9B-v1.0) | [LoRA](https://huggingface.co/s-emanuilov/Tucan-9B-v1.0-LoRA) | [GGUF](https://huggingface.co/s-emanuilov/Tucan-9B-v1.0-GGUF) |
| **27B** | [Tucan-27B-v1.0](https://huggingface.co/s-emanuilov/Tucan-27B-v1.0) | [LoRA](https://huggingface.co/s-emanuilov/Tucan-27B-v1.0-LoRA) | [GGUF](https://huggingface.co/s-emanuilov/Tucan-27B-v1.0-GGUF) 📍 |
|
|
|
*GGUF variants include: q4_k_m, q5_k_m, q6_k, q8_0, q4_0 quantizations* |
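

Since this repository ships the quantized GGUF files, they can also be run without `transformers`, e.g. via [llama-cpp-python](https://github.com/abetlen/llama-cpp-python). A minimal sketch; the exact GGUF filename is an assumption, so check the repository's file list and adjust the glob:

```python
# pip install llama-cpp-python huggingface_hub
from llama_cpp import Llama

# Download a quantized variant from this repo and load it.
# The filename glob assumes the usual naming convention; adjust it
# to match the actual file shown in the repository.
llm = Llama.from_pretrained(
    repo_id="s-emanuilov/Tucan-27B-v1.0-GGUF",
    filename="*q4_k_m.gguf",
    n_ctx=4096,       # context window size
    n_gpu_layers=-1,  # offload all layers to the GPU if one is available
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Здравей! Какво можеш да правиш?"}],  # "Hello! What can you do?"
    temperature=0.1,
    max_tokens=256,
)
print(response["choices"][0]["message"]["content"])
```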
|
|
|
## Usage 🛠️ |
|
|
|
### Quick Start ⚡ |
|
```bash |
|
pip install -U "transformers[torch]" accelerate bitsandbytes |
|
``` |
|
|
|
### Prompt format ⚙️ |
|
**Critical:** Use the prompt format below for function calling to get the best results. |
|
|
|
<details> |
|
<summary><strong>📋 Required System Prompt Template</strong></summary> |
|
|
|
``` |
|
<bos><start_of_turn>user |
|
Ти си полезен AI асистент, който предоставя полезни и точни отговори. |
|
|
|
Имаш достъп и можеш да извикаш една или повече функции, за да помогнеш с потребителското запитване. Използвай ги, само ако е необходимо и подходящо. |
|
|
|
Когато използваш функция, форматирай извикването ѝ в блок ```tool_call``` на отделен ред, а след това ще получиш резултат от изпълнението в блок ```tool_response```. |
|
|
|
## Шаблон за извикване: |
|
```tool_call |
|
{"name": <function-name>, "arguments": <args-json-object>}``` |
|
|
|
## Налични функции: |
|
[your function definitions here] |
|
|
|
## Потребителска заявка: |
|
[your query in Bulgarian]<end_of_turn> |
|
<start_of_turn>model |
|
``` |
|
|
|
</details> |
|
|
|
### Note 📝 |
|
**The model only generates `tool_call` blocks with function names and parameters; it does not execute the functions.** Your client application must parse the generated calls, execute the actual functions (API calls, database queries, etc.), and return the results to the model in `tool_response` blocks so it can continue the conversation and interpret the results. A full demo is coming soon. |
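

For example, a minimal client-side loop could look like the sketch below; the `extract_tool_calls` and `format_tool_response` helpers and the `TOOLS` registry are hypothetical names standing in for your real implementations:

```python
import json
import re

def extract_tool_calls(model_output: str):
    """Parse the JSON payload of every ```tool_call``` block in the model's reply."""
    pattern = r"```tool_call\s*(\{.*?\})\s*```"
    return [json.loads(match) for match in re.findall(pattern, model_output, re.DOTALL)]

def format_tool_response(result) -> str:
    """Wrap an execution result in the ```tool_response``` block the model expects."""
    return f"```tool_response\n{json.dumps(result, ensure_ascii=False)}\n```"

# Hypothetical registry mapping function names to your real implementations.
TOOLS = {
    "create_calendar_event": lambda **kwargs: {"status": "created", **kwargs},
}

# `generated` would be the raw text produced by the model.
generated = '```tool_call\n{"name": "create_calendar_event", "arguments": {"title": "Годишен преглед", "date": "2025-06-08", "start_time": "14:00", "end_time": "14:30"}}```'

for call in extract_tool_calls(generated):
    result = TOOLS[call["name"]](**call["arguments"])
    # Append this block to the next user turn so the model can
    # interpret the result and answer the original query.
    print(format_tool_response(result))
```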
|
|
|
### Python example 🐍 |
|
|
|
<details> |
|
<summary><strong>💻 Complete Working Example</strong></summary> |
|
|
|
```python |
|
import torch |
|
import json |
|
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig |
|
|
|
# Load model |
|
model_name = "s-emanuilov/Tucan-2.6B-v1.0"  # any Tucan size works; use "s-emanuilov/Tucan-27B-v1.0" for this card's model |
|
tokenizer = AutoTokenizer.from_pretrained(model_name) |
|
model = AutoModelForCausalLM.from_pretrained( |
|
model_name, |
|
torch_dtype=torch.bfloat16, |
|
device_map="auto", |
|
attn_implementation="eager" # Required for Gemma models |
|
) |
|
|
|
# Create prompt with system template |
|
def create_prompt(functions, user_query): |
|
system_prompt = """Ти си полезен AI асистент, който предоставя полезни и точни отговори. |
|
|
|
Имаш достъп и можеш да извикаш една или повече функции, за да помогнеш с потребителското запитване. Използвай ги, само ако е необходимо и подходящо. |
|
|
|
Когато използваш функция, форматирай извикването ѝ в блок ```tool_call``` на отделен ред, а след това ще получиш резултат от изпълнението в блок ```tool_response```. |
|
|
|
## Шаблон за извикване: |
|
```tool_call |
|
{{"name": <function-name>, "arguments": <args-json-object>}}``` |
|
""" |
|
|
|
functions_text = json.dumps(functions, ensure_ascii=False, indent=2) |
|
full_prompt = f"{system_prompt}\n## Налични функции:\n{functions_text}\n\n## Потребителска заявка:\n{user_query}" |
|
|
|
chat = [{"role": "user", "content": full_prompt}] |
|
return tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True) |
|
|
|
# Example usage |
|
functions = [{ |
|
"name": "create_calendar_event", |
|
"description": "Creates a new event in Google Calendar.", |
|
"parameters": { |
|
"type": "object", |
|
"properties": { |
|
"title": {"type": "string"}, |
|
"date": {"type": "string"}, |
|
"start_time": {"type": "string"}, |
|
"end_time": {"type": "string"} |
|
}, |
|
"required": ["title", "date", "start_time", "end_time"] |
|
} |
|
}] |
|
|
|
query = "Създай събитие 'Годишен преглед' за 8-ми юни 2025 от 14:00 до 14:30."  # "Create an event 'Annual review' on June 8, 2025, from 14:00 to 14:30."
|
|
|
# Generate response |
|
prompt = create_prompt(functions, query) |
|
inputs = tokenizer(prompt, return_tensors="pt").to(model.device) |
|
|
|
outputs = model.generate( |
|
**inputs, |
|
max_new_tokens=1024, |
|
temperature=0.1, |
|
top_k=25, |
|
top_p=1.0, |
|
repetition_penalty=1.1, |
|
do_sample=True, |
|
eos_token_id=[tokenizer.eos_token_id, tokenizer.convert_tokens_to_ids("<end_of_turn>")], |
|
pad_token_id=tokenizer.eos_token_id |
|
) |
|
|
|
result = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True) |
|
print(result) |
|
``` |
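

For the query above, the model should reply with a single tool call along these lines (illustrative output, not a captured run; the exact date formatting may differ):

````
```tool_call
{"name": "create_calendar_event", "arguments": {"title": "Годишен преглед", "date": "2025-06-08", "start_time": "14:00", "end_time": "14:30"}}```
````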
|
|
|
</details> |
|
|
|
## Performance & Dataset 📊 |
|
|
|
> 📄 **Full methodology, dataset details, and comprehensive evaluation results coming in the upcoming paper** |
|
|
|
**Dataset:** 8,000+ bilingual (Bulgarian/English) function-calling examples spanning 1,000+ topics, including tool calls with single or multiple arguments, optional parameters, follow-up queries, multi-tool selection, ambiguous queries requiring clarification, and conversational interactions without tool use. Data was sourced through manual curation and synthetic generation (Gemini 2.5 Pro / GPT-4.1 / Sonnet 4). |
|
|
|
**Results:** ~40% improvement in tool-use capabilities over base BgGPT models in internal benchmarks. |
|
|
|
## Questions & Contact 💬 |
|
For questions, collaboration, or feedback: **[Connect on LinkedIn](https://www.linkedin.com/in/simeon-emanuilov/)** |
|
|
|
## Acknowledgments 🙏 |
|
Built on top of the [BgGPT series](https://huggingface.co/collections/INSAIT-Institute/bggpt-gemma-2-673b972fe9902749ac90f6fe). |
|
|
|
## License 📄 |
|
The model weights inherit the [Gemma Terms of Use](https://ai.google.dev/gemma/terms) from the base model (see the `license: gemma` metadata above); this model card is licensed under [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/). |