spow12/MK_Nemo_12B

Model Description

This model is a Supervised fine-tuned version of Qwen/Qwen2.5-72B-Instruct with DeepSpeed and trl for korean.

Merge methods.

merge_method: model_stock
name: ChatWaifu_72B_V2.4
models:
    - model: Nexusflow/Athene-V2-Chat
    - model: Nexusflow/Athene-V2-Agent
    - model: Qwen/Qwen2.5-72B-Instruct_instruction_tunned(private)
    - model: anthracite-org/magnum-v4-72b
base_model: Qwen/Qwen2.5-72B-Instruct
dtype: bfloat16
tokenizer_source: base

Trained Data

  • Trained with public, private data (about 500K)

Usage

from transformers import TextStreamer, pipeline, AutoTokenizer, AutoModelForCausalLM

model_id = 'spow12/KoQwen_72B_v5.0'
tokenizer = AutoTokenizer.from_pretrained(model_id)
# %%
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  #Optional
    device_map='auto',
)
model.eval()

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, device_map='auto')

generation_configs = dict(
    max_new_tokens=2048,
    num_return_sequences=1, 
    temperature=0.75,
    # repetition_penalty=1.1,
    do_sample=True,
    top_k=20,
    top_p=0.9,
    min_p=0.1,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.eos_token_id,
    streamer = TextStreamer(tokenizer) # Optional, if you want to use streamer, you have to set num_beams=1
)

sys_message = """당신은 μΉœμ ˆν•œ μ±—λ΄‡μœΌλ‘œμ„œ μƒλŒ€λ°©μ˜ μš”μ²­μ— μ΅œλŒ€ν•œ μžμ„Έν•˜κ³  μΉœμ ˆν•˜κ²Œ λ‹΅ν•΄μ•Όν•©λ‹ˆλ‹€. 
μ‚¬μš©μžκ°€ μ œκ³΅ν•˜λŠ” 정보λ₯Ό μ„Έμ‹¬ν•˜κ²Œ λΆ„μ„ν•˜μ—¬ μ‚¬μš©μžμ˜ μ˜λ„λ₯Ό μ‹ μ†ν•˜κ²Œ νŒŒμ•…ν•˜κ³  그에 따라 닡변을 μƒμ„±ν•΄μ•Όν•©λ‹ˆλ‹€.  

항상 맀우 μžμ—°μŠ€λŸ¬μš΄ ν•œκ΅­μ–΄λ‘œ μ‘λ‹΅ν•˜μ„Έμš”."""

message = [
    {
        'role': "system",
        'content': sys_message
    },
    {
        'role': 'user',
        'content': "ν˜„μž¬μ˜ κ²½μ œμƒν™©μ— λŒ€ν•΄ μ–΄λ–»κ²Œ 생각해?."
    }
]
conversation = pipe(message, **generation_configs)
conversation[-1]
Downloads last month
12
Safetensors
Model size
72.7B params
Tensor type
BF16
Β·
Inference Providers NEW
This model is not currently available via any of the supported Inference Providers.

Model tree for spow12/KoQwen_72B_v5.0