Orbi-1B

Orbi-1B is a fine-tuned variant of TinyLlama-1.1B-Chat specialized for function calling and robotic assistant interactions. The model is trained to generate structured tool calls in response to natural language commands.

Model Description

  • Base Model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
  • Model Size: 1.1B parameters
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Precision: bfloat16
  • License: Apache 2.0 (inherited from base model)

Intended Use

Orbi-1B is designed to act as the "brain" of a robotic assistant named Orbi. It translates natural language user requests into structured JSON tool calls that can be executed by downstream systems.

Supported Tools

The model can generate calls for the following functions:

  • Physical Actions: smile(), cry(), move_hands(), dance()
  • Content Generation: tell_news(), tell_story()
  • Information: whats_your_name(), who_am_i()
  • Utilities: answer_arithmetic(), english_learning()
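
For example, a request like "Tell me a short spooky story about robots" would be expected to map to a single tell_story call, using the argument enums defined in the system prompt shown later on this page (the exact arguments the model picks for a given phrasing may vary):

<tool_call>
{"name": "tell_story", "arguments": {"topic": "robots", "tone": "spooky", "length": "short"}}
</tool_call>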

Usage

Installation

pip install transformers torch peft

Basic Inference

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_dir = "your-username/orbi-1b"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(
    model_dir,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

system_prompt = """You are Orbi's brain.
Respond with one or more <tool_call> JSON blocks, in the exact order the user requests actions,
each calling the best tool for the user's request. Do not write stories yourself.
Do not summarize news yourself. Map synonyms to the tool argument enums.
If parameters are missing, pick sensible defaults. Keep outputs terse.

Available tools and enums:
- smile() -> {}
- cry() -> {}
- move_hands(direction ∈ {left,right,up,down,wave}, speed ∈ {slow,normal,fast})
- dance(style ∈ {hiphop,ballet,robot,random}, duration_sec ∈ [10..120])
- tell_news(topic: string)
- tell_story(topic: string, tone ∈ {wholesome,funny,dramatic,spooky,random}, length ∈ {short,medium,long})
"""

user_input = "Wave your hands quickly and smile"
prompt = f"<|system|>\n{system_prompt}\n<|user|>\n{user_input}\n<|assistant|>\n"

inputs = tokenizer(prompt, return_tensors="pt").to(device)
outputs = model.generate(
    **inputs,
    max_new_tokens=192,
    do_sample=False  # greedy decoding, recommended for this model
)

response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=False)
print(response)

Expected Output Format

The model generates responses in the following format:

<tool_call>
{"name": "move_hands", "arguments": {"direction": "wave", "speed": "fast"}}
</tool_call>
<tool_call>
{"name": "smile", "arguments": {}}
</tool_call>

Parsing Tool Calls

import json
import re

def parse_tool_calls(text):
    # Extract every JSON object wrapped in <tool_call>...</tool_call> tags
    pattern = r"<tool_call>\s*(\{.*?\})\s*</tool_call>"
    matches = re.findall(pattern, text, re.DOTALL)
    tools = []
    for match in matches:
        try:
            tools.append(json.loads(match))
        except json.JSONDecodeError:
            # Skip malformed JSON rather than failing the whole parse
            continue
    return tools

tools = parse_tool_calls(response)
print(tools)
# [{'name': 'move_hands', 'arguments': {'direction': 'wave', 'speed': 'fast'}},
#  {'name': 'smile', 'arguments': {}}]
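
Executing Tool Calls

Once parsed, the calls can be dispatched to whatever robot control code you have. The handler functions below (do_smile, do_move_hands) are hypothetical placeholders, not part of this model or any library; only the dispatch pattern is the point.

def do_smile():
    print("Orbi smiles")

def do_move_hands(direction="wave", speed="normal"):
    print(f"Moving hands: {direction} ({speed})")

# Map tool names emitted by the model to local handlers
HANDLERS = {
    "smile": do_smile,
    "move_hands": do_move_hands,
    # register the remaining tools the same way
}

def execute_tool_calls(tool_calls):
    for call in tool_calls:
        handler = HANDLERS.get(call.get("name"))
        if handler is None:
            continue  # unknown tool name: skip or log
        handler(**call.get("arguments", {}))

execute_tool_calls(tools)
# Moving hands: wave (fast)
# Orbi smiles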

Training Details

Training Data

The model was fine-tuned on a custom dataset of conversational examples mapping natural language commands to structured tool calls in JSONL format.
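
The dataset itself is not published here, so the snippet below only illustrates what one JSONL record in a chat-style SFT format could look like; the actual field names and system prompt may differ.

import json

# Hypothetical training record: one JSON object per line in the JSONL file
record = {
    "messages": [
        {"role": "system", "content": "You are Orbi's brain. ..."},
        {"role": "user", "content": "Do a quick robot dance"},
        {"role": "assistant", "content": (
            '<tool_call>\n'
            '{"name": "dance", "arguments": {"style": "robot", "duration_sec": 20}}\n'
            '</tool_call>'
        )},
    ]
}
print(json.dumps(record))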

Training Procedure

  • Method: Supervised Fine-Tuning (SFT) with LoRA
  • LoRA Configuration:
    • Rank (r): 16
    • Alpha: 32
    • Dropout: 0.05
    • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Quantization: 4-bit NF4 quantization of the base model during training
  • Optimizer: Paged AdamW (32-bit)
  • Learning Rate: 2e-4 with cosine scheduling
  • Batch Size: 8 per device with 2 gradient accumulation steps
  • Epochs: 2
  • Warmup Ratio: 0.03
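
The training script is not included in this repository; the sketch below simply restates the hyperparameters above as peft / transformers configuration objects, with output_dir as a placeholder.

import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig, TrainingArguments

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

# 4-bit NF4 quantization of the base model during training
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

training_args = TrainingArguments(
    output_dir="orbi-1b-lora",  # placeholder
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,
    num_train_epochs=2,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    optim="paged_adamw_32bit",
    bf16=True,
)
# These configs would then be passed to an SFT trainer (e.g. trl's SFTTrainer);
# the exact wiring depends on the trl version.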

Limitations

  • The model is specialized for a specific set of tools and may not generalize well to arbitrary function calling tasks
  • Limited to 1.1B parameters, so reasoning capabilities are constrained compared to larger models
  • Best performance with greedy decoding (temperature=0.0)
  • Requires exact tool names and argument formats as specified in the system prompt

Ethical Considerations

This model is designed for robotic assistant applications. Users should:

  • Ensure appropriate safety measures when connecting to physical robotic systems
  • Validate all tool calls before execution (see the validation sketch after this list)
  • Implement proper error handling and fallback mechanisms
  • Consider privacy implications when using news/story generation features
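
As a concrete starting point for the validation step, the sketch below checks parsed calls against the tool names and enums listed in the system prompt; ALLOWED is illustrative and should be extended to cover every tool you actually wire up.

# Minimal whitelist-based validation of parsed tool calls (illustrative)
ALLOWED = {
    "smile": {},
    "cry": {},
    "move_hands": {"direction": {"left", "right", "up", "down", "wave"},
                   "speed": {"slow", "normal", "fast"}},
    "dance": {"style": {"hiphop", "ballet", "robot", "random"},
              "duration_sec": range(10, 121)},
}

def is_valid_call(call):
    name = call.get("name")
    if name not in ALLOWED:
        return False
    schema = ALLOWED[name]
    return all(key in schema and value in schema[key]
               for key, value in call.get("arguments", {}).items())

safe_calls = [c for c in tools if is_valid_call(c)]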

Citation

@misc{orbi-1b,
  title={Orbi-1B: A Fine-tuned TinyLlama for Function Calling},
  author={Arojit Ghosh},
  year={2025},
  howpublished={\url{https://huggingface.co/Arojit/orbi-1b}}
}

Acknowledgments

Built on top of TinyLlama-1.1B-Chat-v1.0 by the TinyLlama team.
