gemma-2-9b-it-fix-system-role

This is a modified version of gemma-2-9b-it with an updated chat_template that adds support for the system role. It fixes the following errors raised by the original template:

  • "Conversation roles must alternate user/assistant/user/assistant/..."
  • "System role not supported"
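As a quick sanity check, the updated template renders a system message instead of raising the errors above. A minimal sketch using transformers (the stock gemma-2-9b-it template raises a TemplateError on the same input):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("dangvansam/gemma-2-9b-it-fix-system-role")

messages = [
  {"role": "system", "content": "You are a helpful assistant."},
  {"role": "user", "content": "Who are you?"}
]

# Prints the rendered prompt, with the system turn included.
print(tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))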

Model Overview

  • Model Architecture: Gemma 2
    • Input: Text
    • Output: Text
  • Model Size: 9.24B params (BF16, Safetensors)
  • Release Date: 04/12/2024
  • Version: 1.0

Deployment

Use with vLLM

This model can be deployed efficiently using the vLLM backend, as shown in the examples below.

With CLI:

Start an OpenAI-compatible server:

vllm serve dangvansam/gemma-2-9b-it-fix-system-role

Then send a chat request that includes a system message:

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "dangvansam/gemma-2-9b-it-fix-system-role",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Who are you?"}
    ]
  }'

With Python:

from vllm import LLM, SamplingParams
from transformers import AutoTokenizer

model_id = "dangvansam/gemma-2-9b-it-fix-system-role"

sampling_params = SamplingParams(temperature=0.6, top_p=0.9, max_tokens=256)

tokenizer = AutoTokenizer.from_pretrained(model_id)

# The system message is accepted thanks to the updated chat_template.
messages = [
  {"role": "system", "content": "You are a helpful assistant."},
  {"role": "user", "content": "Who are you?"}
]

# Render the conversation into a single prompt string.
prompts = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

llm = LLM(model=model_id)

outputs = llm.generate(prompts, sampling_params)

generated_text = outputs[0].outputs[0].text
print(generated_text)

vLLM also supports OpenAI-compatible serving. See the documentation for more details.
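For example, once the server from the CLI section above is running, the official openai Python client can point at it (a minimal sketch; the api_key value is a placeholder, since vLLM only checks it when the server is started with --api-key):

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
  model="dangvansam/gemma-2-9b-it-fix-system-role",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who are you?"}
  ]
)
print(response.choices[0].message.content)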

Model tree for dangvansam/gemma-2-9b-it-fix-system-role

  • Base model: google/gemma-2-9b