Orbi-1B

Orbi-1B is a fine-tuned variant of TinyLlama-1.1B-Chat specialized for function calling and robotic assistant interactions. The model is trained to generate structured tool calls in response to natural language commands.

Model Description

  • Base Model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
  • Model Size: 1.1B parameters
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Precision: bfloat16
  • License: Apache 2.0 (inherited from base model)

Intended Use

Orbi-1B is designed to act as the "brain" of a robotic assistant named Orbi. It translates natural language user requests into structured JSON tool calls that can be executed by downstream systems.

Supported Tools

The model can generate calls for the following functions:

  • Physical Actions: smile(), cry(), move_hands(), dance()
  • Content Generation: tell_news(), tell_story()
  • Information: whats_your_name(), who_am_i()
  • Utilities: answer_arithmetic(), english_learning()
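
For example, a request like "Tell me a short spooky story about robots" would be expected to map to a single tell_story call, using the argument enums defined in the system prompt shown later on this page (the exact arguments the model picks for a given phrasing may vary):

<tool_call>
{"name": "tell_story", "arguments": {"topic": "robots", "tone": "spooky", "length": "short"}}
</tool_call>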

Usage

Installation

pip install transformers torch peft

Basic Inference

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_dir = "your-username/orbi-1b"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(
    model_dir,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

system_prompt = """You are Orbi's brain.
Respond with one or more <tool_call> JSON blocks, in the exact order the user requests actions,
each calling the best tool for the user's request. Do not write stories yourself.
Do not summarize news yourself. Map synonyms to the tool argument enums.
If parameters are missing, pick sensible defaults. Keep outputs terse.

Available tools and enums:
- smile() -> {}
- cry() -> {}
- move_hands(direction ∈ {left,right,up,down,wave}, speed ∈ {slow,normal,fast})
- dance(style ∈ {hiphop,ballet,robot,random}, duration_sec ∈ [10..120])
- tell_news(topic: string)
- tell_story(topic: string, tone ∈ {wholesome,funny,dramatic,spooky,random}, length ∈ {short,medium,long})
"""

user_input = "Wave your hands quickly and smile"
prompt = f"<|system|>\n{system_prompt}\n<|user|>\n{user_input}\n<|assistant|>\n"

inputs = tokenizer(prompt, return_tensors="pt").to(device)
outputs = model.generate(
    **inputs,
    max_new_tokens=192,
    do_sample=False  # greedy decoding, recommended for this model
)

response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=False)
print(response)

Expected Output Format

The model generates responses in the following format:

<tool_call>
{"name": "move_hands", "arguments": {"direction": "wave", "speed": "fast"}}
</tool_call>
<tool_call>
{"name": "smile", "arguments": {}}
</tool_call>

Parsing Tool Calls

import json
import re

def parse_tool_calls(text):
    # Extract every JSON object wrapped in <tool_call>...</tool_call> tags
    pattern = r"<tool_call>\s*(\{.*?\})\s*</tool_call>"
    matches = re.findall(pattern, text, re.DOTALL)
    tools = []
    for match in matches:
        try:
            tools.append(json.loads(match))
        except json.JSONDecodeError:
            # Skip malformed JSON rather than failing the whole parse
            continue
    return tools

tools = parse_tool_calls(response)
print(tools)
# [{'name': 'move_hands', 'arguments': {'direction': 'wave', 'speed': 'fast'}},
#  {'name': 'smile', 'arguments': {}}]
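
Executing Tool Calls

Once parsed, the calls can be dispatched to whatever robot control code you have. The handler functions below (do_smile, do_move_hands) are hypothetical placeholders, not part of this model or any library; only the dispatch pattern is the point.

def do_smile():
    print("Orbi smiles")

def do_move_hands(direction="wave", speed="normal"):
    print(f"Moving hands: {direction} ({speed})")

# Map tool names emitted by the model to local handlers
HANDLERS = {
    "smile": do_smile,
    "move_hands": do_move_hands,
    # register the remaining tools the same way
}

def execute_tool_calls(tool_calls):
    for call in tool_calls:
        handler = HANDLERS.get(call.get("name"))
        if handler is None:
            continue  # unknown tool name: skip or log
        handler(**call.get("arguments", {}))

execute_tool_calls(tools)
# Moving hands: wave (fast)
# Orbi smiles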

Training Details

Training Data

The model was fine-tuned on a custom dataset of conversational examples mapping natural language commands to structured tool calls in JSONL format.
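
The dataset itself is not published here, so the snippet below only illustrates what one JSONL record in a chat-style SFT format could look like; the actual field names and system prompt may differ.

import json

# Hypothetical training record: one JSON object per line in the JSONL file
record = {
    "messages": [
        {"role": "system", "content": "You are Orbi's brain. ..."},
        {"role": "user", "content": "Do a quick robot dance"},
        {"role": "assistant", "content": (
            '<tool_call>\n'
            '{"name": "dance", "arguments": {"style": "robot", "duration_sec": 20}}\n'
            '</tool_call>'
        )},
    ]
}
print(json.dumps(record))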

Training Procedure

  • Method: Supervised Fine-Tuning (SFT) with LoRA
  • LoRA Configuration:
    • Rank (r): 16
    • Alpha: 32
    • Dropout: 0.05
    • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Quantization: 4-bit NF4 quantization of the base model during training
  • Optimizer: Paged AdamW (32-bit)
  • Learning Rate: 2e-4 with cosine scheduling
  • Batch Size: 8 per device with 2 gradient accumulation steps
  • Epochs: 2
  • Warmup Ratio: 0.03
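
The training script is not included in this repository; the sketch below simply restates the hyperparameters above as peft / transformers configuration objects, with output_dir as a placeholder.

import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig, TrainingArguments

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

# 4-bit NF4 quantization of the base model during training
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

training_args = TrainingArguments(
    output_dir="orbi-1b-lora",  # placeholder
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,
    num_train_epochs=2,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    optim="paged_adamw_32bit",
    bf16=True,
)
# These configs would then be passed to an SFT trainer (e.g. trl's SFTTrainer);
# the exact wiring depends on the trl version.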

Limitations

  • The model is specialized for a specific set of tools and may not generalize well to arbitrary function calling tasks
  • Limited to 1.1B parameters, so reasoning capabilities are constrained compared to larger models
  • Best performance with greedy decoding (temperature=0.0)
  • Requires exact tool names and argument formats as specified in the system prompt

Ethical Considerations

This model is designed for robotic assistant applications. Users should:

  • Ensure appropriate safety measures when connecting to physical robotic systems
  • Validate all tool calls before execution (see the validation sketch after this list)
  • Implement proper error handling and fallback mechanisms
  • Consider privacy implications when using news/story generation features
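
As a concrete starting point for the validation step, the sketch below checks parsed calls against the tool names and enums listed in the system prompt; ALLOWED is illustrative and should be extended to cover every tool you actually wire up.

# Minimal whitelist-based validation of parsed tool calls (illustrative)
ALLOWED = {
    "smile": {},
    "cry": {},
    "move_hands": {"direction": {"left", "right", "up", "down", "wave"},
                   "speed": {"slow", "normal", "fast"}},
    "dance": {"style": {"hiphop", "ballet", "robot", "random"},
              "duration_sec": range(10, 121)},
}

def is_valid_call(call):
    name = call.get("name")
    if name not in ALLOWED:
        return False
    schema = ALLOWED[name]
    return all(key in schema and value in schema[key]
               for key, value in call.get("arguments", {}).items())

safe_calls = [c for c in tools if is_valid_call(c)]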

Citation

@misc{orbi-1b,
  title={Orbi-1B: A Fine-tuned TinyLlama for Function Calling},
  author={Arojit Ghosh},
  year={2025},
  howpublished={\url{https://huggingface.co/Arojit/orbi-1b}}
}

Acknowledgments

Built on top of TinyLlama-1.1B-Chat-v1.0 by the TinyLlama team.
