# HowRU-KoELECTRA-Emotion-Classifier

## Model Description
A KoELECTRA-based emotion classification model for Korean text, aimed in particular at diary and psychological journal entries. It recognizes eight emotions in text: Joy, Excitement, Neutral, Surprise, Disgust, Fear, Sadness, and Anger.
- Model type: Text Classification (Emotion Recognition)
- Language: Korean (ํ๊ตญ์ด, ko)
- License: MIT
- Finetuned from model: monologg/koelectra-base-v3-discriminator
## Emotion Classes

This model classifies the dominant emotion of an input Korean sentence into one of the eight classes below.
| Emotion (Korean label) | Emotion (English) |
|---|---|
| 기쁨 | Joy |
| 설렘 | Excitement |
| 평범함 | Neutral |
| 놀라움 | Surprise |
| 불쾌함 | Disgust |
| 두려움 | Fear |
| 슬픔 | Sadness |
| 분노 | Anger |
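The Korean strings above are the label names returned by the model. To confirm the exact id-to-label mapping used by this checkpoint, the configuration can be inspected directly (a minimal sketch):

```python
from transformers import AutoConfig

# Inspect the checkpoint's label mapping without loading the full model weights
config = AutoConfig.from_pretrained("LimYeri/HowRU-KoELECTRA-Emotion-Classifier")
for idx, label in sorted(config.id2label.items()):
    print(idx, label)
```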
## How to Get Started with the Model
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import torch.nn.functional as F

# 1) Load model & tokenizer
MODEL_NAME = "LimYeri/HowRU-KoELECTRA-Emotion-Classifier"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)

# Automatically use the GPU if one is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()

# Emotion label mapping (id2label)
id2label = model.config.id2label

# 2) Inference function
def predict_emotion(text: str):
    """
    Returns:
        - top1_emotion: the predicted emotion label with its probability
        - top2_emotions: the two most likely emotions
        - all_probabilities: per-emotion probabilities, sorted in descending order
    """
    # Tokenize
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        padding=True,
        max_length=512
    ).to(device)

    # Inference
    with torch.no_grad():
        logits = model(**inputs).logits
        probs = F.softmax(logits, dim=-1)[0]

    # Probabilities sorted in descending order
    probs_sorted = sorted(
        [(id2label[i], float(probs[i])) for i in range(len(probs))],
        key=lambda x: x[1],
        reverse=True
    )

    top1_pred = probs_sorted[0]
    top2_pred = probs_sorted[:2]

    return {
        "text": text,
        "top1_emotion": top1_pred,
        "top2_emotions": top2_pred,
        "all_probabilities": probs_sorted,
    }

# 3) Example
result = predict_emotion("오늘 정말 기분이 좋고 행복한 하루였어!")  # "Today was such a good, happy day!"
print(result)
```
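For scoring several diary entries at once, the same components can be reused in a batched variant. This is a sketch that reuses `tokenizer`, `model`, `device`, and `id2label` from the snippet above; the example sentences are illustrative only:

```python
# Batched variant: tokenize several texts together and take the argmax per row
texts = [
    "오늘 정말 기분이 좋고 행복한 하루였어!",   # "Today was such a good, happy day!"
    "내일 발표가 있어서 너무 긴장돼.",          # "I have a presentation tomorrow and I'm so nervous."
]
batch = tokenizer(texts, return_tensors="pt", truncation=True, padding=True, max_length=512).to(device)
with torch.no_grad():
    probs = F.softmax(model(**batch).logits, dim=-1)
for text, row in zip(texts, probs):
    print(text, "->", id2label[int(row.argmax())], f"({float(row.max()):.4f})")
```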
### Using `pipeline`
```python
from transformers import pipeline

MODEL_NAME = "LimYeri/HowRU-KoELECTRA-Emotion-Classifier"

classifier = pipeline(
    "text-classification",
    model=MODEL_NAME,
    tokenizer=MODEL_NAME,
    top_k=None  # return probabilities for all emotion classes
)

# Prediction
text = "오늘 정말 기분이 좋고 행복한 하루였어!"  # "Today was such a good, happy day!"
result = classifier(text)
result = result[0]

print("Input sentence:", text)
print("\nTop-1 emotion:", result[0]['label'], f"({result[0]['score']:.4f})")
print("\nAll emotion probabilities:")
for r in result:
    print(f"  {r['label']}: {r['score']:.4f}")
```
## Training Details

### Training Data
- Total (split 8:2): 50,000 rows
- Train: 40,000 rows
- Validation: 10,000 rows
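A minimal sketch of how an 8:2 split like this can be produced with the `datasets` library; the DataFrame and column names here are hypothetical placeholders, not the original data:

```python
import pandas as pd
from datasets import Dataset

# Hypothetical frame standing in for the real diary dataset ("text" / "label" columns assumed)
df = pd.DataFrame({"text": ["placeholder sentence"] * 10, "label": [0] * 10})

split = Dataset.from_pandas(df).train_test_split(test_size=0.2, seed=42)
train_ds, eval_ds = split["train"], split["test"]
print(len(train_ds), len(eval_ds))  # 8, 2 for the toy frame above
```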
### Training Procedure
- Base Model: monologg/koelectra-base-v3-discriminator
- Objective: Single-label classification
- Max Length: 512
### Training Hyperparameters
- num_train_epochs: 3
- learning_rate: 3e-5
- weight_decay: 0.02
- warmup_ratio: 0.15
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 64
- max_grad_norm: 1.0
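For reference, a sketch of how the hyperparameters above map onto Hugging Face `TrainingArguments`; other arguments such as `output_dir` and the evaluation strategy are assumptions, not taken from the original training script:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="howru-koelectra-emotion",  # hypothetical output path
    num_train_epochs=3,
    learning_rate=3e-5,
    weight_decay=0.02,
    warmup_ratio=0.15,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    max_grad_norm=1.0,
)
```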
## Performance
| Metric | Score |
|---|---|
| Eval Accuracy | 0.95 |
| Eval F1 Macro | 0.95 |
| Eval Loss | 0.16 |
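A sketch of how accuracy and macro-F1 figures of this kind are typically computed via a `Trainer` `compute_metrics` hook; this is illustrative, not the original evaluation code:

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1_macro": f1_score(labels, preds, average="macro"),
    }
```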
## Model Architecture
1) ELECTRA Encoder (Base-size)
- Hidden size: 768
- Layers: 12 Transformer blocks
- Attention heads: 12
- MLP intermediate size: 3072
- Activation: GELU
- Dropout: 0.1
2) Classification Head
An additional classification head for predicting the eight emotion classes (sketched below):
- Dense layer: 768 → 768
- Activation: GELU
- Dropout: 0.1
- Output projection: 768 → 8
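A minimal PyTorch sketch of the head described above, mirroring ELECTRA's standard classification head; this is an illustration, not the exact training code:

```python
import torch
import torch.nn as nn

class EmotionClassificationHead(nn.Module):
    """Dense(768 -> 768) + GELU + Dropout + projection to 8 emotion logits."""

    def __init__(self, hidden_size: int = 768, num_labels: int = 8, dropout: float = 0.1):
        super().__init__()
        self.dense = nn.Linear(hidden_size, hidden_size)
        self.dropout = nn.Dropout(dropout)
        self.out_proj = nn.Linear(hidden_size, num_labels)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        x = hidden_states[:, 0, :]              # representation at the [CLS] position
        x = self.dropout(x)
        x = nn.functional.gelu(self.dense(x))
        x = self.dropout(x)
        return self.out_proj(x)                 # [batch, 8] emotion logits
```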
## Citation

```bibtex
@misc{HowRUEmotion2025,
  title={HowRU KoELECTRA Emotion Classifier},
  author={Lim, Yeri},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/LimYeri/HowRU-KoELECTRA-Emotion-Classifier}}
}
```