Fireworks Function Calling (FireFunction) Model V1

firefunction

FireFunction is a state-of-the-art function calling model with a commercially viable license. Key info and highlights:

๐Ÿ’ก The model is also hosted on the Fireworks platform. Offered for free during a limited beta period

โญ๏ธ Near GPT-4 level quality for real-world use cases of structured information generation and routing decision-making

๐Ÿ’จ Blazing fast speed. Inference speeds are roughly 4x that of GPT-4 when using FireFunction hosted on the Fireworks platform

๐Ÿ”„ Support for "any" parameter in tool_choice. Firefunction is the only model that we're aware that supports an option for the model to always choose a function - particularly helpful for routing use cases

โœ… The model is also API compatible with OpenAI function calling.

OPENAI_API_BASE=https://api.fireworks.ai/inference/v1
OPENAI_API_KEY=<YOUR_FIREWORKS_API_KEY>
MODEL=accounts/fireworks/models/firefunction-v1

Resources

Intended Use and Limitations

Primary Use

Although the model was trained on a variety of tasks, it performs best on:

  • single-turn request routing to a function picked from a pool of up to 20 function specs.
  • structured information extraction. See blog post for more info on FireFunction.

Out-of-Scope Use

The model was not optimized for the following use cases:

  • general multi-turn chat,
  • parallel and nested function calls in a single response. These can be broken into multiple messages.

Example Usage

See documentation for more detail.

from transformers import AutoModelForCausalLM, AutoTokenizer
import json

device = "cuda" # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained("fireworks-ai/firefunction-v1", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("fireworks-ai/firefunction-v1")

function_spec = [
    {
        "name": "get_stock_price",
        "description": "Get the current stock price",
        "parameters": {
            "type": "object",
            "properties": {
                "symbol": {
                    "type": "string",
                    "description": "The stock symbol, e.g. AAPL, GOOG"
                }
            },
            "required": [
                "symbol"
            ]
        }
    },
    {
        "name": "check_word_anagram",
        "description": "Check if two words are anagrams of each other",
        "parameters": {
            "type": "object",
            "properties": {
                "word1": {
                    "type": "string",
                    "description": "The first word"
                },
                "word2": {
                    "type": "string",
                    "description": "The second word"
                }
            },
            "required": [
                "word1",
                "word2"
            ]
        }
    }
]
functions = json.dumps(function_spec, indent=4)

messages = [
    {'role': 'functions', 'content': functions},
    {'role': 'system', 'content': 'You are a helpful assistant with access to functions. Use them if required.'},
    {'role': 'user', 'content': 'Hi, can you tell me the current stock price of AAPL?'}
]

model_inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

generated_ids = model.generate(model_inputs, max_new_tokens=128)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])

Demo App

Check our easy-to-extend demo chat app with function calling capabilities built on Firefunction model.

Downloads last month
36
Safetensors
Model size
46.7B params
Tensor type
FP16
ยท
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.