ํ•œ๊ตญ์–ด ๊ต์œก ์ž๋ฃŒ ํŒŒ์ธํŠœ๋‹ ๋ชจ๋ธ (Qwen2.5-1.5B + LoRA)

๋ชจ๋ธ ์†Œ๊ฐœ

This repository contains a LoRA adapter (LoRA weights only) fine-tuned from Qwen/Qwen2.5-1.5B-Instruct on the maywell/korean_textbooks dataset using LoRA (Low-Rank Adaptation). The base weights are not included; load the base model and the adapter together.

  • ํ•™์Šต ๋ฐฉ์‹: LoRA (QLoRA, 4bit ๋กœ๋”ฉ)
  • ์ฃผ์š” ๋ชฉ์ : ํ•œ๊ตญ์–ด ๊ต์œก/์„ค๋ช…ํ˜• ์‘๋‹ต ํ’ˆ์งˆ ํ–ฅ์ƒ

์ฐธ๊ณ : ํ•™์Šต์—๋Š” Unsloth/TRL/PEFT ์Šคํƒ์„ ์‚ฌ์šฉํ–ˆ๊ณ , ์ถ”๋ก ์€ HF Transformers + PEFT๋งŒ์œผ๋กœ ๊ฐ€๋Šฅํ•ฉ๋‹ˆ๋‹ค.

์‚ฌ์šฉ ๋ฐฉ๋ฒ•

1) ๋ชจ๋ธ ๋กœ๋“œ(4bit + PEFT)

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

BASE = "Qwen/Qwen2.5-1.5B-Instruct"
ADAPTER = "Eunma/korean-model"

tokenizer = AutoTokenizer.from_pretrained(BASE)
base = AutoModelForCausalLM.from_pretrained(
    BASE,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),  # 4-bit quantized loading
    device_map="auto",
    trust_remote_code=True
)

# Attach the LoRA adapter on top of the quantized base model
model = PeftModel.from_pretrained(base, ADAPTER)
model.eval()

messages = [
    # System prompt: "You are an educational assistant that explains accurately and kindly in Korean."
    { "role": "system", "content": "한국어로 정확하고 친절하게 설명하는 교육 도우미입니다." },
    # User prompt: "Briefly explain powers of two."
    { "role": "user", "content": "2의 거듭제곱에 대해 간단히 설명해줘." },
]

# Build the prompt with Qwen's chat template, then tokenize
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
enc = tokenizer(prompt, return_tensors="pt").to(model.device)
if "attention_mask" not in enc:  # defensive; tokenizers normally return this already
    enc["attention_mask"] = torch.ones_like(enc["input_ids"])

with torch.inference_mode():  # disable autograd for faster generation
    out = model.generate(
        **enc,
        max_new_tokens=256,
        do_sample=True, temperature=0.7, top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
        use_cache=True
    )

# Decode only the newly generated tokens, skipping the echoed prompt
response = tokenizer.decode(out[0][enc["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
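
2) Optional: merging the adapter (full precision)

For deployment without PEFT at inference time, the adapter can be folded into a full-precision copy of the base weights. A minimal sketch, assuming fp16 loading (merging is not supported directly on 4-bit quantized weights); the output directory name is an example:

from transformers import AutoModelForCausalLM
from peft import PeftModel
import torch

# Load the base model without quantization so the LoRA deltas can be merged in
base_fp16 = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-1.5B-Instruct", torch_dtype=torch.float16, device_map="auto"
)
merged = PeftModel.from_pretrained(base_fp16, "Eunma/korean-model").merge_and_unload()
merged.save_pretrained("korean-model-merged")  # example output directory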

ํ›ˆ๋ จ ์ •๋ณด

  • ๋ฒ ์ด์Šค ๋ชจ๋ธ: Qwen/Qwen2.5-1.5B
  • ํ›ˆ๋ จ ์Šคํ…: 30 steps
  • ์˜ตํ‹ฐ๋งˆ์ด์ €: adamw_8bit
  • ์Šค์ผ€์ค„๋Ÿฌ: linear
  • LoRA ์„ค์ •: r=8, alpha=16
  • ํƒ€๊ฒŸ ๋ชจ๋“ˆ: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • ๋ฐ์ดํ„ฐ์…‹: maywell/korean_textbooks

์‹œ์Šคํ…œ ์š”๊ตฌ์‚ฌํ•ญ

  • GPU ๋ฉ”๋ชจ๋ฆฌ: ์ตœ์†Œ 6GB (๊ถŒ์žฅ 8GB+)
  • ํ•™์Šต(QLoRA, 4bit): GPU 12โ€“16GB ๊ถŒ์žฅ(T4 16GB์—์„œ ํ™•์ธ)
  • Python: 3.10+
  • ์ฃผ์š” ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ: transformers, peft, torch, bitsandbytes, accelerate

์ฃผ์˜์‚ฌํ•ญ

  1. ํ•œ๊ตญ์–ด ์ค‘์‹ฌ์œผ๋กœ ํŠœ๋‹. ํƒ€ ์–ธ์–ด ์‘๋‹ต ํ’ˆ์งˆ์€ ์ œํ•œ์ ์ผ ์ˆ˜ ์žˆ์Œ.
  2. ๋ฒ ์ด์Šค ๋ผ์ด์„ ์Šค ๋ฐ ์‚ฌ์šฉ ์ •์ฑ… ์ค€์ˆ˜
  3. ์–ด๋Œ‘ํ„ฐ๋งŒ ํฌํ•จ๋˜์–ด ์žˆ์œผ๋ฏ€๋กœ ๋ฒ ์ด์Šค ๋ชจ๋ธ๊ณผ ํ•จ๊ป˜ ๋กœ๋“œ
  4. ์‚ฌ์‹ค์„ฑ ๊ฒ€์ฆ ํ•„์š”.

๊ด€๋ จ ๋งํฌ

๐Ÿ“œ ๋ผ์ด์„ ์Šค

์ด ๋ชจ๋ธ์€ ๋ฒ ์ด์Šค ๋ชจ๋ธ์ธ Qwen2.5-1.5B์˜ ๋ผ์ด์„ ์Šค๋ฅผ ๋”ฐ๋ฆ…๋‹ˆ๋‹ค.
