DARE-TIES Merged Model (Ratio: 0.7)

This is a merged model created with mergekit using the DARE_TIES method.

Base Models

  • Qwen/Qwen2.5-Coder-7B-Instruct (Weight: 0.3)
  • lightblue/Karasu-DPO-7B (Weight: 0.7)

Merge Method

  • Method: DARE_TIES
  • Density: 0.5
  • Data Type: bfloat16
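
For reference, the merge can be expressed as a mergekit YAML config. The sketch below is reconstructed from the numbers above, not the published config; in particular, the base_model line (the shared ancestor that task vectors are taken against) is an assumption and may differ from what was actually used.

# dare_ties_0.7.yaml -- illustrative reconstruction, not the published config
merge_method: dare_ties
base_model: Qwen/Qwen2.5-7B-Instruct  # assumption: shared ancestor of both models
dtype: bfloat16
models:
  - model: Qwen/Qwen2.5-Coder-7B-Instruct
    parameters:
      weight: 0.3
      density: 0.5
  - model: lightblue/Karasu-DPO-7B
    parameters:
      weight: 0.7
      density: 0.5

A command like mergekit-yaml dare_ties_0.7.yaml ./merged would then write out the merged weights.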

Purpose

This model aims to enhance Japanese code generation capabilities while maintaining English coding performance.

Usage

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "noirchan/DARE-TIES-Qwen2.5-Coder-Karasu-0.7"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# bfloat16 matches the merged checkpoint's tensor type
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")
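
To generate a response, the tokenizer's chat template can be applied. This is a minimal sketch assuming the repo ships Qwen2.5's chat template (mergekit normally copies the tokenizer over); the prompt, which asks for a Fibonacci function in Japanese, and the generation settings are only illustrative.

# Illustrative prompt: ask for a Fibonacci function in Japanese
messages = [{"role": "user", "content": "フィボナッチ数列を計算するPython関数を書いてください。"}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))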

Evaluation

This model is one point (Karasu weight 0.7) in a systematic sweep of merge ratios, aimed at finding the best balance between Japanese language capabilities and code generation performance.
