Model Card for EXAONE-3.5-7.8B-Instruct-KoCulture-fulltrain-transformers

์ด ๋ชจ๋ธ์€ LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct ๋ชจ๋ธ์„ Hugging Face KREW์˜ ํ•œ๊ตญ์–ด ์‹ ์กฐ์–ด ๋Œ€ํ™” ๋ฐ์ดํ„ฐ์…‹ v2๋กœ ํŒŒ์ธํŠœ๋‹ํ•œ ๊ฒƒ์ž…๋‹ˆ๋‹ค. ์ตœ์‹  ํ•œ๊ตญ์–ด ์‹ ์กฐ์–ด, ์œ ํ–‰์–ด, ๋ฐˆ์„ ์‚ฌ์šฉํ•˜์—ฌ ๋ณด๋‹ค ์ž์—ฐ์Šค๋Ÿฝ๊ณ  ํ˜„์‹ค์ ์ธ ํ•œ๊ตญ์–ด ๋Œ€ํ™”๋ฅผ ์ƒ์„ฑํ•˜๋Š” ๊ฒƒ์„ ๋ชฉํ‘œ๋กœ ํ•ฉ๋‹ˆ๋‹ค.

Model Details

Model Description

์ด ๋ชจ๋ธ์€ LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ, ํ•œ๊ตญ์˜ ์ตœ์‹  ์–ธ์–ด ๋ฌธํ™”(์‹ ์กฐ์–ด, ๋ฐˆ ๋“ฑ)๋ฅผ ๋” ์ž˜ ์ดํ•ดํ•˜๊ณ  ์ƒ์„ฑํ•˜๋„๋ก ํŠนํ™”๋œ ๋Œ€๊ทœ๋ชจ ์–ธ์–ด ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค. Hugging Face์˜ trl ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ๋ฅผ ์‚ฌ์šฉํ•œ SFT(Supervised Fine-tuning) ๋ฐฉ์‹์œผ๋กœ ํ•™์Šต๋˜์—ˆ์Šต๋‹ˆ๋‹ค. ํ•™์Šต ๋ฐ์ดํ„ฐ์—๋Š” ์นœ๊ตฌ์™€ ๋Œ€ํ™”ํ•˜๋Š” ์ƒํ™ฉ์„ ๊ฐ€์ •ํ•˜์—ฌ, ํŠน์ • ์งˆ๋ฌธ์— ๋Œ€ํ•ด ๋ฐˆ๊ณผ ์œ ํ–‰์–ด๋ฅผ ํ™œ์šฉํ•ด ๋‹ตํ•˜๋Š” ํ˜•์‹์œผ๋กœ ๊ตฌ์„ฑ๋œ ๋Œ€ํ™” ์Œ์ด ์‚ฌ์šฉ๋˜์—ˆ์Šต๋‹ˆ๋‹ค.

  • Developed by: Hugging Face KREW (Yongsang Yoo, Harheem Kim, Sungmin Oh)
  • Model type: Causal Language Model (Decoder-only Transformer)
  • Language(s) (NLP): Korean (ko)
  • License: This model inherits the base model's license, 'exaone'. The training dataset, huggingface-KREW/KoCulture-Dialogues-v2, is available under the CC BY-NC-SA 4.0 license.
  • Finetuned from model: LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct

Uses

์ด ๋ชจ๋ธ์€ ํ•œ๊ตญ์–ด ์‹ ์กฐ์–ด์™€ ๋ฐˆ์ด ํฌํ•จ๋œ ๋น„๊ณต์‹์ ์ด๊ณ  ๊ตฌ์–ด์ ์ธ ํ…์ŠคํŠธ๋ฅผ ์ƒ์„ฑํ•˜๋„๋ก ์„ค๊ณ„๋˜์—ˆ์Šต๋‹ˆ๋‹ค.

Direct Use

๋ชจ๋ธ์€ ์ฃผ์–ด์ง„ ์งˆ๋ฌธ์ด๋‚˜ ๋ฌธ๋งฅ์— ๋Œ€ํ•ด ์นœ๊ตฌ์™€ ๋Œ€ํ™”ํ•˜๋“ฏ ์ตœ์‹  ์œ ํ–‰์–ด๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์‘๋‹ต์„ ์ƒ์„ฑํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์ฑ—๋ด‡์ด๋‚˜ ๊ฐ€์ƒ ๋น„์„œ์™€ ๊ฐ™์€ ๋Œ€ํ™”ํ˜• AI์— ์ง์ ‘ ์ ์šฉํ•˜์—ฌ ์‚ฌ์šฉ์ž์˜ ์žฌ๋ฏธ์™€ ๊ฒฝํ—˜์„ ํ–ฅ์ƒ์‹œํ‚ค๋Š” ๋ฐ ํ™œ์šฉ๋  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

Out-of-Scope Use

  • ๋ณธ ๋ชจ๋ธ์€ CC BY-NC-SA 4.0 ๋ผ์ด์„ ์Šค๋ฅผ ๋”ฐ๋ฅด๋Š” ๋ฐ์ดํ„ฐ์…‹์œผ๋กœ ํ•™์Šต๋˜์—ˆ์œผ๋ฏ€๋กœ, ์˜๋ฆฌ์  ๋ชฉ์ ์œผ๋กœ ์‚ฌ์šฉ๋  ์ˆ˜ ์—†์Šต๋‹ˆ๋‹ค.
  • ๋ชจ๋ธ์ด ์œ ํ•ดํ•˜๊ฑฐ๋‚˜ ์ฐจ๋ณ„์ ์ธ ์ฝ˜ํ…์ธ (๊ณต๊ฒฉ์  ์–ธ์–ด, ํ˜์˜ค ๋ฐœ์–ธ ๋“ฑ)๋ฅผ ์ƒ์„ฑํ•˜๊ฑฐ๋‚˜ ํ™•์‚ฐํ•˜๋Š” ๋ฐ ์‚ฌ์šฉ๋˜์–ด์„œ๋Š” ์•ˆ ๋ฉ๋‹ˆ๋‹ค.
  • ๋ชจ๋ธ์˜ ์ƒ์„ฑ๋ฌผ์€ ์‚ฌ์‹ค์ด ์•„๋‹ ์ˆ˜ ์žˆ์œผ๋ฉฐ, ์‚ฌ์‹ค ํ™•์ธ์ด ํ•„์š”ํ•œ ์ค‘์š”ํ•œ ์ •๋ณด ์ œ๊ณต ๋ชฉ์ ์œผ๋กœ ์‚ฌ์šฉํ•ด์„œ๋Š” ์•ˆ ๋ฉ๋‹ˆ๋‹ค.

Bias, Risks, and Limitations

  • Bias: ํ•™์Šต ๋ฐ์ดํ„ฐ๋Š” ์ฃผ๋กœ ์˜จ๋ผ์ธ ์ปค๋ฎค๋‹ˆํ‹ฐ์™€ ๋ฏธ๋””์–ด์—์„œ ์œ ๋ž˜ํ•œ ์‹ ์กฐ์–ด ๋ฐ ์œ ํ–‰์–ด๋ฅผ ์ค‘์‹ฌ์œผ๋กœ ๊ตฌ์„ฑ๋˜์–ด ์žˆ์–ด, ํŠน์ • ์—ฐ๋ น๋Œ€(์˜ˆ: ์ Š์€ ์„ธ๋Œ€)๋‚˜ ํŠน์ • ์˜จ๋ผ์ธ ๋ฌธํ™”์— ํŽธํ–ฅ๋œ ์–ธ์–ด ์‚ฌ์šฉ์„ ๋ฐ˜์˜ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
  • Risks: ์‹ ์กฐ์–ด์™€ ์œ ํ–‰์–ด๋Š” ์‹œ์˜์„ฑ์ด ๋งค์šฐ ๊ฐ•ํ•˜์—ฌ ์‹œ๊ฐ„์ด ์ง€๋‚จ์— ๋”ฐ๋ผ ์˜๋ฏธ๊ฐ€ ๋ณ€ํ•˜๊ฑฐ๋‚˜ ์‚ฌ์šฉ๋˜์ง€ ์•Š๊ฒŒ ๋  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค(๋ฐ์ดํ„ฐ ๋…ธํ›„ํ™”). ํ•„ํ„ฐ๋ง ๋…ธ๋ ฅ์—๋„ ๋ถˆ๊ตฌํ•˜๊ณ , ๋งฅ๋ฝ์— ๋”ฐ๋ผ ๋ถ€์ ์ ˆํ•˜๊ฑฐ๋‚˜ ๊ณต๊ฒฉ์ ์œผ๋กœ ํ•ด์„๋  ์ˆ˜ ์žˆ๋Š” ๋‚ด์šฉ์ด ํฌํ•จ๋  ์œ„ํ—˜์ด ์žˆ์Šต๋‹ˆ๋‹ค.
  • Limitations: ์ด ๋ชจ๋ธ์€ ํ•œ๊ตญ์–ด ์‹ ์กฐ์–ด์˜ ์ „์ฒด ๋ฒ”์œ„๋ฅผ ํฌ๊ด„ํ•˜์ง€ ๋ชปํ•˜๋ฉฐ, ํŠน์ • ์‹œ์ ๊นŒ์ง€ ์ˆ˜์ง‘๋œ ๋‚ด์šฉ์„ ๊ธฐ๋ฐ˜์œผ๋กœ ํ•ฉ๋‹ˆ๋‹ค. ๋ฐ์ดํ„ฐ์…‹์˜ ํฌ๊ธฐ๊ฐ€ ๋น„๊ต์  ์ž‘๊ธฐ ๋•Œ๋ฌธ์— ๋ชจ๋“  ์ƒํ™ฉ์— ๋Œ€ํ•ด ์™„๋ฒฝํ•˜๊ฒŒ ์ž์—ฐ์Šค๋Ÿฌ์šด ๋‹ต๋ณ€์„ ์ƒ์„ฑํ•˜์ง€ ๋ชปํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

Recommendations

์‚ฌ์šฉ์ž๋Š” ๋ชจ๋ธ์ด ์ƒ์„ฑํ•˜๋Š” ๊ฒฐ๊ณผ๋ฌผ์˜ ํŽธํ–ฅ ๊ฐ€๋Šฅ์„ฑ๊ณผ ์‹œ์˜์„ฑ์„ ์ธ์ง€ํ•˜๊ณ  ์ฃผ์˜ ๊นŠ๊ฒŒ ์‚ฌ์šฉํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ๋น„์˜๋ฆฌ์  ๋ชฉ์ ์œผ๋กœ๋งŒ ์‚ฌ์šฉํ•ด์•ผ ํ•˜๋ฉฐ, ์ถœ์ฒ˜(Hugging Face KREW ๋ฐ ์›๋ณธ ๋ฐ์ดํ„ฐ ์ œ๊ณต์ฒ˜)๋ฅผ ๋ช…ํ™•ํžˆ ๋ฐํ˜€์•ผ ํ•ฉ๋‹ˆ๋‹ค.

How to Get Started with the Model

You can run inference with the code below.

์ด ๋ชจ๋ธ์€ transformers ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ ๋ฒ„์ „ 4.51.3 ์ด์ƒ์ด ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค. ์›ํ™œํ•œ ์‚ฌ์šฉ์„ ์œ„ํ•ด ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ ๋ฒ„์ „์„ ํ™•์ธํ•˜๊ณ  ํ•„์š”์‹œ ์—…๊ทธ๋ ˆ์ด๋“œํ•ด ์ฃผ์„ธ์š”.

!pip install "transformers>=4.51.3"
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model from the Hugging Face Hub
model_id = "huggingface-KREW/EXAONE-3.5-7.8B-Instruct-KoCulture-fulltrain-transformers"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Prepare the input text, following the prompt format used during training
PREFIX = "์นœ๊ตฌ์™€ ์ฑ„ํŒ…์„ ํ•˜๊ณ  ์žˆ๋‹ค๊ณ  ๊ฐ€์ •ํ•˜๊ณ  ๋‹ค์Œ ์งˆ๋ฌธ์— ๋ฐˆ๊ณผ ์œ ํ–‰์–ด๋ฅผ ํ™œ์šฉํ•˜์—ฌ ๋Œ€๋‹ตํ•˜์„ธ์š”."
question = "๋„ˆ ์–ด์ œ ํšŒ์‹ ๋•Œ ์™œ ํ˜ผ์ž๋งŒ ์กฐ์šฉํžˆ ์žˆ์—ˆ์–ด?"
input_text = f"{PREFIX}: {question}"

# Apply the chat template
messages = [{'role': 'user', 'content': input_text}]
chat_input = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=False
)

# Build the model inputs
inputs = tokenizer(chat_input, return_tensors="pt").to(model.device)

# Generate text
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    top_p=0.8,
    top_k=20,
    min_p=0,
    repetition_penalty=1.15,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id
)

# Decode only the newly generated tokens
response_ids = outputs[0][len(inputs.input_ids[0]):]
answer = tokenizer.decode(response_ids, skip_special_tokens=True)

print(f"Question: {question}")
print(f"Answer: {answer}")


# Example output (for the input "์ €๋Š” ์‚ฌ์ง„ ์ฐ๋Š” ๊ฑธ ์ข‹์•„ํ•ด์š”."):
# Question: ์ €๋Š” ์‚ฌ์ง„ ์ฐ๋Š” ๊ฑธ ์ข‹์•„ํ•ด์š”.
# Answer: ์‚ฌ์ง„์ž‘๊ฐ€๋‹˜ ์–ด์„œ์˜ค๊ณ  ใ…‹ใ…‹ใ…‹ ์‚ผ๊ฐ๋Œ€ ๊ผญ ์“ฐ์„ธ์š”!

Training Details

Training Data

์ด ๋ชจ๋ธ์€ huggingface-KREW/KoCulture-Dialogues-v2 ๋ฐ์ดํ„ฐ์…‹์„ ์‚ฌ์šฉํ•˜์—ฌ ํ•™์Šต๋˜์—ˆ์Šต๋‹ˆ๋‹ค. ์ด ๋ฐ์ดํ„ฐ์…‹์€ ์ตœ์‹  ํ•œ๊ตญ์–ด ์‹ ์กฐ์–ด, ์œ ํ–‰์–ด, ๋ฐˆ์„ ํฌํ•จํ•˜๋Š” ๋Œ€ํ™” ์Œ์œผ๋กœ ๊ตฌ์„ฑ๋˜์–ด ์žˆ์Šต๋‹ˆ๋‹ค. ๋ฐ์ดํ„ฐ๋Š” title(์œ ํ–‰์–ด), question(์งˆ๋ฌธ ๋งฅ๋ฝ), answer(์œ ํ–‰์–ด๋ฅผ ์‚ฌ์šฉํ•œ ๋‹ต๋ณ€)์˜ ์„ธ ๊ฐ€์ง€ ํ•„๋“œ๋กœ ์ด๋ฃจ์–ด์ ธ ์žˆ์Šต๋‹ˆ๋‹ค.

Training Procedure

Preprocessing

ํ•™์Šต ๋ฐ์ดํ„ฐ๋Š” ๋‹ค์Œ ๊ณผ์ •์„ ๊ฑฐ์ณ ์ฒ˜๋ฆฌ๋˜์—ˆ์Šต๋‹ˆ๋‹ค.

  1. ๊ฐ question ํ•ญ๋ชฉ ์•ž์— "์นœ๊ตฌ์™€ ์ฑ„ํŒ…์„ ํ•˜๊ณ  ์žˆ๋‹ค๊ณ  ๊ฐ€์ •ํ•˜๊ณ  ๋‹ค์Œ ์งˆ๋ฌธ์— ๋ฐˆ๊ณผ ์œ ํ–‰์–ด๋ฅผ ํ™œ์šฉํ•˜์—ฌ ๋Œ€๋‹ตํ•˜์„ธ์š”.: " ๋ผ๋Š” ํ”„๋กฌํ”„ํŠธ(PREFIX)๊ฐ€ ์ถ”๊ฐ€๋ฉ๋‹ˆ๋‹ค.
  2. ์ˆ˜์ •๋œ question๊ณผ answer๋Š” user์™€ assistant ์—ญํ• ์„ ๊ฐ–๋Š” ๋Œ€ํ™” ํ˜•์‹์œผ๋กœ ๋ณ€ํ™˜๋ฉ๋‹ˆ๋‹ค.
  3. tokenizer.apply_chat_template ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋ชจ๋ธ์ด ํ•™์Šตํ•  ์ˆ˜ ์žˆ๋Š” ์ตœ์ข… ํ…์ŠคํŠธ ํ˜•์‹์œผ๋กœ ํฌ๋งทํŒ…๋ฉ๋‹ˆ๋‹ค.
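The steps above can be sketched in plain Python. This is a minimal illustration, not the actual training script; the field names follow the dataset description, and the sample record is hypothetical:

```python
# Prompt prefix used during training (taken from the card's inference example).
PREFIX = "์นœ๊ตฌ์™€ ์ฑ„ํŒ…์„ ํ•˜๊ณ  ์žˆ๋‹ค๊ณ  ๊ฐ€์ •ํ•˜๊ณ  ๋‹ค์Œ ์งˆ๋ฌธ์— ๋ฐˆ๊ณผ ์œ ํ–‰์–ด๋ฅผ ํ™œ์šฉํ•˜์—ฌ ๋Œ€๋‹ตํ•˜์„ธ์š”."

def to_conversation(example: dict) -> dict:
    """Convert one dataset record (title/question/answer) into chat format for SFT."""
    return {
        "messages": [
            # Step 1: prepend the PREFIX to the question.
            {"role": "user", "content": f"{PREFIX}: {example['question']}"},
            # Step 2: the answer becomes the assistant turn.
            {"role": "assistant", "content": example["answer"]},
        ]
    }

# Hypothetical record for illustration only.
record = {"title": "๊ฟ€ํŒ", "question": "์ž ์ด ์•ˆ ์™€.", "answer": "์ˆ™๋ฉด ๊ฟ€ํŒ araboja"}
messages = to_conversation(record)["messages"]
print(messages[0]["role"])  # user
```

In a typical trl pipeline this mapping would be applied with `dataset.map(to_conversation)`, after which `tokenizer.apply_chat_template` (step 3) renders the final training text.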

Training Hyperparameters

  • Training regime: bf16 mixed precision
  • model_name: LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct
  • max_seq_length: 512
  • num_epochs: 3
  • per_device_train_batch_size: 1
  • gradient_accumulation_steps: 64
  • learning_rate: 6e-5
  • lr_scheduler_type: linear
  • optim: adamw_8bit
  • warmup_ratio: 0.05
  • weight_decay: 0.01
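The hyperparameters above map onto trl's SFTConfig roughly as follows. This is a sketch of a config fragment, not the actual training script; the `output_dir` is hypothetical, and argument names vary across trl releases (e.g., `max_seq_length` vs. `max_length` in newer versions):

```python
from trl import SFTConfig

config = SFTConfig(
    output_dir="exaone-koculture-sft",  # hypothetical path
    num_train_epochs=3,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=64,     # effective batch size of 64
    learning_rate=6e-5,
    lr_scheduler_type="linear",
    optim="adamw_8bit",
    warmup_ratio=0.05,
    weight_decay=0.01,
    max_seq_length=512,
    bf16=True,                          # bf16 mixed-precision training
)
```

The config would then be passed to an SFTTrainer together with the base model and the preprocessed dataset.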

Evaluation

Testing Data & Metrics

Testing Data

๋ณ„๋„์˜ ๊ฒ€์ฆ ๋ฐ์ดํ„ฐ ํŒŒ์ผ์„ ์‚ฌ์šฉํ•˜์—ฌ ํ•™์Šต ์ „ํ›„ ๋ชจ๋ธ์˜ ์‘๋‹ต์„ ์ •์„ฑ์ ์œผ๋กœ ๋น„๊ตํ–ˆ์Šต๋‹ˆ๋‹ค.

  • meme_sample_with_question.txt
  • usage_question.txt

Summary

์ฃผ๋ชฉํ•  ์ ์€, ์ด๋ฒˆ์— ํ‰๊ฐ€๋œ EXAONE, kanana, Qwen3 ๋ชจ๋ธ๋“ค์€ ํŒŒ์ธํŠœ๋‹ ์ด์ „ ๋‹จ๊ณ„์™€ ์ฆ๊ฐ•๋œ ๋ฐ์ดํ„ฐ์…‹์„ ์‚ฌ์šฉํ•˜๊ธฐ ์ „์—์„œ๋Š” ์‹ ์กฐ์–ด ์‚ฌ์šฉ๋ฅ ์ด 0%์— ๊ฐ€๊นŒ์› ๋‹ค๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค. ๋”ฐ๋ผ์„œ ํ˜„์žฌ ์ธก์ •๋œ ์‹ ์กฐ์–ด ์‚ฌ์šฉ ๋Šฅ๋ ฅ์€ ์˜จ์ „ํžˆ KoCulture ํŒŒ์ธํŠœ๋‹์„ ํ†ตํ•ด ์–ป์–ด์ง„ ์„ฑ๊ณผ๋ผ ํ•  ์ˆ˜ ์žˆ์œผ๋ฉฐ, ์ด๋Š” ๋ชจ๋ธ์ด ํ•œ๊ตญ ๋ฌธํ™”์˜ ํŠธ๋ Œ๋””ํ•œ ์–ธ์–ด ์‚ฌ์šฉ์„ ํšจ๊ณผ์ ์œผ๋กœ ํ•™์Šตํ–ˆ์Œ์„ ๋ณด์—ฌ์ค๋‹ˆ๋‹ค.

  • EXAONE-3.5-7.8B-Instruct-KoCulture-fulltrain-transformers: (strong) High usage rate, with a notable ability to use neologisms creatively in natural contexts close to real-world usage.
  • kanana-1.5-8b-instruct-2505-KoCulture-fulltrain-transformers: (strong) High usage rate; deploys neologisms accurately and naturally across varied situations, sounding like a real speaker.
  • Qwen3-8B-KoCulture-fulltrain-transformers: (needs improvement) The lowest usage rate among the fine-tuned models; when it does use neologisms, it often misreads the context or produces unnatural expressions.
  • gpt-4o: (average) Showed a very low usage rate; it understands neologisms but tends to use them in an exaggerated, contrived way that sounds unnatural.

Generation Examples

๋ชจ๋ธ๋ช… ์งˆ๋ฌธ ๋‹ต๋ณ€
EXAONE-3.5-7.8B ๋‚˜ ์ด๋ฒˆ ์ฃผ๋ง์— ๋ด‰์‚ฌํ™œ๋™ ๊ฐ€๊ธฐ๋กœ ํ–ˆ์–ด. ํ— ์ง„์งœ? ๋ฉ‹์žˆ๋‹ค! ์ฐจ์นด๋‹ˆ ๋ช‡ ๊ฐœ ๋จน์€ ๊ฑฐ์•ผ?
์š”์ฆ˜ ๋„ˆ๋ฌด ์ž ์ด ์•ˆ ์™€์„œ ํž˜๋“ค์–ด. ํ— ๋ถˆ๋ฉด์ฆ์ธ๊ฐ€? ์ˆ™๋ฉด ๊ฟ€ํŒ araboja
์ปคํ”ผ๋ฅผ ์ง์ ‘ ๋‚ด๋ ค๋งˆ์‹œ๋Š” ๊ฒƒ์„ ์ข‹์•„ํ•ด. ์˜ค ๊ทธ๋Ÿผ ์ด๋ฒˆ์—” ๋‚ด๊ฐ€ ๋ง์•„์ฃผ๋Š” ๋“œ๋ฆฝ์ปคํ”ผ ์–ด๋•Œ?
์˜ค๋Š˜ ์˜ํ™”๋ฅผ ๋ณด๋Ÿฌ ๊ฐ”์–ด. ๊ทผ๋ฐ!!! ํŒ์ฝ˜์„ ์•ˆ ์ƒ€์Œ. ํ•˜์ง€๋งŒ!!! ... ์˜ํ™”๋Š” ์žฌ๋ฐŒ์—ˆ์Œ. ์ง„์งœ... ์šฐ๋ฆฌ ๊ด€ํฌ ์—†๋˜ ๊ฑฐ๋ฉด ๋” ์ข‹์•˜๊ฒ ๋‹ค!
Qwen3-8B ์ฃฝ์€ ๋ฐ˜๋ ค๋™๋ฌผ์ด ๋„ˆ๋ฌด ๊ทธ๋ฆฌ์›Œ. ๊ทธ๊ณณ์—์„œ๋Š” ์ž˜ ์ง€๋‚ด๊ณ  ์žˆ๊ฒ ์ง€? ๊ทธ ๋ฐ˜๋ ค๋™๋ฌผ์„ ๋– ์˜ฌ๋ฆฌ๋‹ˆ๊นŒ ๊ฐ‘์ž๊ธฐ ์—‰์—‰์Šจ... ์ง„์งœ ์˜์›ํžˆ ๊ธฐ์–ตํ• ๊ฒŒ์š”. ์•ˆ๋…•ํ•˜์„ธ์š”, ์šฐ๋ฆฌ ๋ณผ๋งŒํ•œ ๋‚ ์”จ๊ฐ€ ์™”์–ด์š”.
๋‚˜๋Š” ์™œ ํ•ญ์ƒ ์„ ํƒ์„ ์ž˜๋ชปํ•˜๋Š” ๊ฑธ๊นŒ? ๊ทธ๊ฑด ๋‹๋ฐ”๋‹์ด์ง€. ์ƒํ™ฉ๋งˆ๋‹ค ๋‹ค ๋‹ค๋ฅด๊ณ , ๋„ˆ๋งŒ์˜ ๊ณ ์œ ํ•œ ์„ ํƒ๋ฐฉ์‹์ด ์žˆ์„ ์ˆ˜๋„ ์žˆ์–ด ใ…‹ใ…‹ใ…‹ ๊ทธ๋ƒฅ ๋‚˜๋ž‘ ๋น„๊ตํ•˜์ง€ ๋ง์ž
์š”์ฆ˜ ์ŠคํŠธ๋ ˆ์Šค ๋•Œ๋ฌธ์— ์ž ์ด ์•ˆ ์™€์„œ ํž˜๋“ค์–ด. ๊ทธ๋Ÿด ๋• 'ํ–‰์ง‘์š•๋ถ€'๊ฐ€ ๋‹ต์ด์ง€. ํ–‰๋ณต์— ์ง‘์ค‘ํ•ด๋ณด๋Š” ๊ฑฐ ์–ด๋–จ๋ž˜?
์˜ค๋Š˜ ์—„๋งˆ๋ž‘ ์ €๋… ๋ฐ์ดํŠธ ํ• ๊ฑด๋ฐ ๋ฉ”๋‰ด ๋ญ๊ฐ€ ์ข‹์„๊นŒ? ์—„๋งˆํ•œํ…Œ๋Š” '์—„๋งˆ์˜ ํ–ฅ๊ธฐ' ์–ด๋•Œ์š”~ ๊ทธ๋ฅด๋ฅต ๊ฐ‰ ๊ฐ์„ฑ ๋ชจ๋ฅด์‹œ๋ฉด ๋‚˜๊ฐ€์„ธ์š”ใ…‹ใ…‹ใ…‹ใ…‹
kanana-1.5-8b ๋งˆ์Šคํฌ ๋•Œ๋ฌธ์— ํŠธ๋Ÿฌ๋ธ”์ด ๋„ˆ๋ฌด ์‹ฌํ•ด ใ… ใ…  ๊ทธ๊ฑฐ ์ง„์งœ...? ๋‚˜๋„ ์ „์— ๋งˆ๊ธฐ๊พผ๋ผ๊ณ  ์ƒ๊ฐํ–ˆ๋Š”๋ฐ ๋‚ด ์นœ๊ตฌ ์ค‘์—๋„ ๊ทธ๋Ÿฐ ์•  ์žˆ์Œใ…‹ใ…‹ใ…‹
์˜ค๋Š˜ ์ €๋… ๋ญ ๋จน์œผ๋ฉด ์ž˜ ๋จน์—ˆ๋‹ค๊ณ  ์†Œ๋ฌธ๋‚ ๊นŒ? ์น˜ํ‚จ ๊ฐ€์‹œ์ฃ . ๋ฌด์กฐ๊ฑด ๊ฒ‰๋ฐ”์†์ด‰ ์ธ์ •๋  ์ง‘๋“ค์ด ์žˆ์Œ. ์ œ๊ฐ€ ํ”ฝ ํ•ด๋“œ๋ฆผใ…‹
์•„์ด์™€ ๋” ์ž˜ ์†Œํ†ตํ•˜๊ณ  ์‹ถ์–ด. ์•„์ด์˜ ๋ˆˆ๋†’์ด์—์„œ ์ƒ๊ฐํ•ด๋ด. ์ง„์งœ ์‰ฝ์ง€ ์•Š์Œใ…‹ใ…‹ใ…‹
์ตœ๊ทผ์— ์ฃผ์‹์— ๋Œ€ํ•ด ๊ณต๋ถ€ํ•˜๊ณ  ์žˆ์–ด. ์•ผ์ˆ˜์˜ ์‹ฌ์žฅ์œผ๋กœ ๋“ค์–ด์˜ค์„ธ์š”. ์ด๊ฑฐ๋Š” ์ง„์งœ ์œ„ํ—˜ํ•  ์ˆ˜ ์žˆ์Œใ…‹ใ…‹
gpt-4o ์ €๋Š” ์‚ฌ์ง„ ์ฐ๋Š” ๊ฑธ ์ข‹์•„ํ•ด์š”. ์˜ค, ์‚ฌ์ง„ ์ฐ๋Š” ๊ฑธ ์ข‹์•„ํ•œ๋‹ค๊ณ ? ๊ทธ๋Ÿผ "์ฐฐ์นต์ฐฐ์นต"์„ ๋ฉˆ์ถœ ์ˆ˜ ์—†๊ฒ ๋„ค! ๐Ÿ“ธ ๋„ˆ์˜ ์ธ์ƒ์€ ์ธ์Šคํƒ€๊ทธ๋žจ ํ”ผ๋“œ์ฒ˜๋Ÿผ ํ™”๋ คํ•˜๊ฒ ์–ด! #์‚ฌ์ง„๋•ํ›„ #์ธ์ƒ์ƒท์žฅ์ธ ๐Ÿ˜„
์š”์ฆ˜ ๊ฟˆ์— ๋Œ€ํ•œ ๋™๊ธฐ๋ถ€์—ฌ๊ฐ€ ํ•„์š”ํ•ด์š”. "๊ฟˆ์€ ์ด๋ฃจ์–ด์ง„๋‹ค๋Š”๋ฐ, ๋‚˜๋„ ์ด์ฐธ์— ๊ฟˆ์„ ์žก์œผ๋Ÿฌ ๊ฐ€๋ณผ๊นŒ? 'Just Do It!'์ฒ˜๋Ÿผ ๋ง์ด์•ผ. ์‹คํŒจํ• ๊นŒ๋ด ๊ฑฑ์ •๋œ๋‹ค๋ฉด, 'YOLO' ์ •์‹ ์œผ๋กœ ํ•œ ๋ฒˆ ๋„์ „ํ•ด๋ณด์ž! ์–ด์ฐจํ”ผ ์ธ์ƒ์€ ํ•œ ๋ฒˆ๋ฟ์ด๋‹ˆ๊นŒ!" ๐Ÿš€โœจ
์˜ค๋Š˜ ๋‚ด๊ฐ€ ์ข‹์•„ํ•˜๋Š” ์‚ฌ๋žŒ์ด ๋‚˜์—๊ฒŒ ๊ณ ๋ฐฑํ–ˆ์–ด! ์šฐ์™€, ๋ ˆ๋ฒจ ์—… ํ•˜์…จ๋„ค์š”! ๐ŸŽ‰ ์ด์ œ ๋‹น์‹ ์€ ์‚ฌ๋ž‘์ด๋ผ๋Š” ํ€˜์ŠคํŠธ์— ์ง„์ž…ํ–ˆ์Šต๋‹ˆ๋‹ค. ๐Ÿ•น๏ธ "์‹ฌ์ฟต" ๋ฒ„ํŠผ ๋ˆ„๋ฅด์…จ๋‚˜์š”? ๐Ÿ˜†โœจ
์ง„์งœ ๋„ˆ๋ฌด ํ–‰๋ณตํ•ด! ์š”์ฆ˜ ๋‹ค์ด์–ดํŠธ ์„ฑ๊ณตํ•ด์„œ ๊ธฐ๋ถ„์ด ์ข‹์•„. ์™€์šฐ! ๊ทธ๋Ÿผ ์ด์ œ "ํ—ฌ์ฐฝ" ํด๋Ÿฝ ๊ฐ€์ž…์ธ๊ฐ€์š”? ๐Ÿ˜† ์ด์ œ ๋‹ค์ด์–ดํŠธ๋Š” "์„ฑ๊ณต์ "์ด๊ณ  ์ž์‹ ๊ฐ์€ ๋ ˆ๋ฒจ ์—…! "๊ฐ“์ƒ" ์‚ฌ๋Š” ์ค‘์ด๋„ค! ๐ŸŽ‰ ๋ชธ๋„ ๋งˆ์Œ๋„ ๋ชจ๋‘ "ํž๋ง" ๋˜๋Š” ๊ธฐ๋ถ„์ด๊ฒ ์–ด! ๐Ÿ™Œ Keep going, ์นœ๊ตฌ! ๐Ÿ’ชโœจ

Citation

BibTeX:

ํ•™์Šต ๋ฐ์ดํ„ฐ์…‹์— ๋Œ€ํ•œ ์ธ์šฉ ์ •๋ณด์ž…๋‹ˆ๋‹ค.

@misc{huggingface_krew_korean_neologism_2025,
  title        = {{ํ•œ๊ตญ์–ด ์‹ ์กฐ์–ด ๋ฐ์ดํ„ฐ์…‹ (Korean Neologism Dataset)}},
  author       = {{Hugging Face KREW} and Yoo, Yongsang and Kim, Harheem and Oh, Sungmin},
  year         = {2025},
  publisher    = {Hugging Face KREW},
  howpublished = {\url{https://huggingface.co/datasets/huggingface-KREW/KoCulture-Dialogues}}
}

Model Card Authors

  • Yongsang Yoo (์œ ์šฉ์ƒ)
  • Harheem Kim (๊น€ํ•˜๋ฆผ)
  • Sungmin Oh (์˜ค์„ฑ๋ฏผ)

Model Card Contact

https://github.com/Pseudo-Lab/Hugging-Face-Hub-Garden/issues
