Korean Emotion Classification (44 labels, KoELECTRA)

📌 Overview

This model is trained to classify 44 emotions in Korean text.
The base model is monologg/koelectra-base-v3-discriminator,
fine-tuned on the KOTE dataset plus additionally collected data.


🧾 Model Info

  • Base Model: KoELECTRA-base-v3-discriminator
  • Task: Multi-label emotion classification
  • Labels: 44 emotions
  • Loss Function: Asymmetric Loss (γ⁻ = 3)
  • Threshold: 0.6
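
The Asymmetric Loss named above can be sketched as follows. This is a minimal implementation in the style of Ridnik et al.'s ASL; the card only specifies γ⁻ = 3, so γ⁺ = 0 and the absence of probability shifting are assumptions here:

```python
import torch

def asymmetric_loss(logits, targets, gamma_neg=3.0, gamma_pos=0.0, eps=1e-8):
    """Asymmetric Loss for multi-label classification (sketch).

    A larger focusing parameter on the negative term (gamma_neg) than on
    the positive term (gamma_pos) down-weights easy negatives, which helps
    with the label imbalance typical of multi-label emotion data.
    """
    # Per-label sigmoid probabilities (multi-label setting, not softmax)
    probs = torch.sigmoid(logits)
    # Positive term: mild (or no) focal modulation via gamma_pos
    pos = targets * (1 - probs) ** gamma_pos * torch.log(probs.clamp(min=eps))
    # Negative term: stronger modulation via gamma_neg suppresses easy negatives
    neg = (1 - targets) * probs ** gamma_neg * torch.log((1 - probs).clamp(min=eps))
    return -(pos + neg).mean()
```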

🎯 Emotion Labels (44 total)

불평/불만, 환영/호의, 감동/감탄, 지긋지긋, 고마움, 슬픔, 화남/분노, 존경, 기대감, 우쭐댐/무시함, 안타까움/실망, 비장함, 의심/불신, 뿌듯함, 편안/쾌적, 신기함/관심, 아껴주는, 부끄러움, 공포/무서움, 절망, 한심함, 역겨움/징그러움, 짜증, 어이없음, 없음, 패배/자기혐오, 귀찮음, 힘듦/지침, 즐거움/신남, 깨달음, 죄책감, 증오/혐오, 흐뭇함(귀여움/예쁨), 당황/난처, 경악, 부담/안_내킴, 서러움, 재미없음, 불쌍함/연민, 놀람, 행복, 불안/걱정, 기쁨, 안심/신뢰


📊 Performance (on the validation set)

  • Micro F1: ~0.62
  • Micro Precision: ~0.70
  • Micro Recall: ~0.55
  • Macro F1: ~0.47
  • Hamming Loss: ~0.12
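
For context on the gap between the micro and macro scores above: micro metrics pool true/false positives across all 44 labels, while macro F1 averages the per-label F1 scores, so rare emotions pull macro F1 down. A small NumPy sketch of how these multi-label metrics are computed:

```python
import numpy as np

def multilabel_metrics(y_true, y_pred):
    """Micro F1, macro F1, and Hamming loss for 0/1 indicator matrices
    of shape (num_samples, num_labels)."""
    tp = (y_true * y_pred).sum(axis=0)          # per-label true positives
    fp = ((1 - y_true) * y_pred).sum(axis=0)    # per-label false positives
    fn = (y_true * (1 - y_pred)).sum(axis=0)    # per-label false negatives
    # Micro: pool counts over all labels before computing F1
    micro_f1 = 2 * tp.sum() / (2 * tp.sum() + fp.sum() + fn.sum())
    # Macro: compute F1 per label, then average (guard against 0/0)
    per_label_f1 = 2 * tp / np.maximum(2 * tp + fp + fn, 1)
    macro_f1 = per_label_f1.mean()
    # Hamming loss: fraction of individual label decisions that are wrong
    hamming = (y_true != y_pred).mean()
    return micro_f1, macro_f1, hamming
```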

🚀 Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the model
model_name = "tobykim/koelectra-44emotions"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Input sentence ("I feel so good today!")
text = "오늘 너무 기분 좋아!"
inputs = tokenizer(text, return_tensors="pt")

# Inference: per-label sigmoid (multi-label), not softmax
with torch.no_grad():
    logits = model(**inputs).logits
    probs = torch.sigmoid(logits).numpy()[0]

# Emotion label mapping (order must match the model's output dimension)
LABELS = [
    '불평/불만', '환영/호의', '감동/감탄', '지긋지긋',
    '고마움', '슬픔', '화남/분노', '존경', '기대감',
    '우쭐댐/무시함', '안타까움/실망', '비장함',
    '의심/불신', '뿌듯함', '편안/쾌적', '신기함/관심',
    '아껴주는', '부끄러움', '공포/무서움', '절망',
    '한심함', '역겨움/징그러움', '짜증', '어이없음',
    '없음', '패배/자기혐오', '귀찮음', '힘듦/지침',
    '즐거움/신남', '깨달음', '죄책감', '증오/혐오',
    '흐뭇함(귀여움/예쁨)', '당황/난처', '경악',
    '부담/안_내킴', '서러움', '재미없음', '불쌍함/연민',
    '놀람', '행복', '불안/걱정', '기쁨', '안심/신뢰'
]

# 0.6 matches the threshold given in the model info above
threshold = 0.6
results = [(label, float(p)) for label, p in zip(LABELS, probs) if p > threshold]
print(sorted(results, key=lambda x: x[1], reverse=True))
```
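
Because this is multi-label classification, it is possible that no emotion clears the threshold. One common convention (an assumption here, not something this model card specifies) is to fall back to the single most probable label:

```python
import numpy as np

def decode_labels(probs, labels, threshold=0.6):
    """Map a per-label probability vector to (label, score) pairs."""
    # Keep every label above the threshold...
    picked = [(lab, float(p)) for lab, p in zip(labels, probs) if p > threshold]
    # ...or fall back to the argmax label if nothing qualifies (assumed convention)
    if not picked:
        i = int(np.argmax(probs))
        picked = [(labels[i], float(probs[i]))]
    return sorted(picked, key=lambda x: x[1], reverse=True)
```

Called with `probs` and `LABELS` from the snippet above, this always returns at least one emotion.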

---

# ๐Ÿท๏ธ ๋ผ์ด์„ ์Šค

Base model: KoELECTRA (MIT License)

Dataset: KOTE + ์ถ”๊ฐ€ ์ˆ˜์ง‘ ๋ฐ์ดํ„ฐ (๊ณต๊ฐœ ๋ฐ์ดํ„ฐ ๊ธฐ๋ฐ˜)

Model: ์ž์œ ๋กญ๊ฒŒ ์—ฐ๊ตฌ/ํ•™์Šต ๋ชฉ์  ์‚ฌ์šฉ ๊ฐ€๋Šฅ