Dracarys-72B-Instruct

Introduction

We introduce the latest in the Smaug series, the Dracarys family of finetunes targeting coding performance improvements across a variety of base models.

This variant is a finetune of Qwen2-72B-Instruct.

Compared to Qwen2-72B-Instruct, Dracarys has better LiveCodeBench scores (see evaluation results below).

Model Description

How to use

The prompt format is unchanged from Qwen2-72B-Instruct. See the evaluations for the prompt details used with LiveCodeBench (LCB).
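For orientation, Qwen2-72B-Instruct uses a ChatML-style layout in which each turn is wrapped in `<|im_start|>` and `<|im_end|>` markers. The sketch below builds such a prompt by hand so the structure is visible. The helper name `build_chatml_prompt` is hypothetical, and the tokenizer's own `apply_chat_template` remains the authoritative source for the exact template (for instance, it may insert a default system message that this sketch does not).

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Render chat messages in a ChatML-style layout (illustrative sketch only)."""
    parts = []
    for message in messages:
        # Each turn: start marker, role, newline, content, end marker, newline.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


example_messages = [
    {"role": "system", "content": "You are a coding assistant."},
    {"role": "user", "content": "Reverse a string in Python."},
]
example_prompt = build_chatml_prompt(example_messages)
print(example_prompt)
```

In practice, prefer `apply_chat_template` as in the Transformers snippet, since it always matches the template shipped with the model.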

Use with transformers

See the snippet below for usage with Transformers:

import transformers
import torch

model_id = "abacusai/Dracarys-72B-Instruct"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a data science coding assistant that generates Python code using Pandas and Numpy."},
    {"role": "user", "content": "Write code to select rows from the dataframe `df` having the maximum `temp` for each `city`"},
]

prompt = pipeline.tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

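# Qwen2 ends each turn with the ChatML token <|im_end|>;
# <|eot_id|> is a Llama 3 token and is not in this tokenizer's vocabulary.
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|im_end|>"),
]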

outputs = pipeline(
    prompt,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
print(outputs[0]["generated_text"][len(prompt):])

Evaluation Results

LiveCodeBench

| Model | Code Generation | Code Execution | Test Output Prediction |
|---|---|---|---|
| Dracarys-72B-Instruct | 33.57 | 62.96 | 58.93 |
| Qwen2-72B-Instruct | 32.92 | 58.95 | 55.88 |
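
To make the comparison with the base model concrete, the following sketch computes the per-task difference in percentage points from the LiveCodeBench scores reported above. The dictionary layout and the helper name `score_deltas` are illustrative assumptions; only the numbers come from the table.

```python
# Scores copied from the LiveCodeBench table above.
livecodebench = {
    "Dracarys-72B-Instruct": {
        "Code Generation": 33.57,
        "Code Execution": 62.96,
        "Test Output Prediction": 58.93,
    },
    "Qwen2-72B-Instruct": {
        "Code Generation": 32.92,
        "Code Execution": 58.95,
        "Test Output Prediction": 55.88,
    },
}


def score_deltas(scores, model, baseline):
    """Return model minus baseline for every task, rounded to two decimals."""
    return {
        task: round(scores[model][task] - scores[baseline][task], 2)
        for task in scores[model]
    }


deltas = score_deltas(livecodebench, "Dracarys-72B-Instruct", "Qwen2-72B-Instruct")
for task, delta in deltas.items():
    print(f"{task}: {delta:+.2f} points")
```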

Breakdown of LiveCodeBench CodeGeneration

| Model | Easy | Medium | Hard |
|---|---|---|---|
| Dracarys-72B-Instruct | 64.16 | 25.06 | 3.64 |
| Qwen2-72B-Instruct | 65.83 | 22.28 | 3.11 |

Breakdown of LiveCodeBench TestOutputPrediction

| Model | Easy | Medium | Hard |
|---|---|---|---|
| Dracarys-72B-Instruct | 65.37 | 58.74 | 46.38 |
| Qwen2-72B-Instruct | 63.19 | 54.08 | 46.52 |

LiveBench (July update)

| Model | Global Average | Coding Average | Language Average | Mathematics Average | Data Analysis Average | Reasoning Average | IF Average |
|---|---|---|---|---|---|---|---|
| Dracarys-72B-Instruct | 41.20 | 38.95 | 31.17 | 42.77 | 26.24 | 40 | 68.08 |
| Qwen2-72B-Instruct | 40.15 | 32.38 | 29.21 | 43.44 | 26.24 | 41.33 | 68.27 |
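
The Global Average values above appear consistent with an unweighted mean of the six category averages. The sketch below checks this from the table values; it is an observation about these numbers, not a statement of how LiveBench officially aggregates scores, and the dictionary layout is an illustrative assumption.

```python
# Values copied from the LiveBench table above. Category order:
# Coding, Language, Mathematics, Data Analysis, Reasoning, IF.
livebench = {
    "Dracarys-72B-Instruct": {
        "global": 41.20,
        "categories": [38.95, 31.17, 42.77, 26.24, 40.0, 68.08],
    },
    "Qwen2-72B-Instruct": {
        "global": 40.15,
        "categories": [32.38, 29.21, 43.44, 26.24, 41.33, 68.27],
    },
}

category_means = {}
for model, row in livebench.items():
    # Unweighted mean of the six category averages.
    category_means[model] = sum(row["categories"]) / len(row["categories"])
    print(
        f"{model}: mean of categories = {category_means[model]:.3f}, "
        f"reported global = {row['global']:.2f}"
    )
```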