---
license: apache-2.0
tags:
  - text-classification
  - topic-analysis
  - vietnamese
  - vsfc
  - phobert
language:
  - vi
datasets:
  - uit-vsfc
model-index:
  - name: VSFC Topic Classifier (PhoBERT)
    results:
      - task:
          type: text-classification
          name: Topic Classification
        dataset:
          name: UIT-VSFC
          type: uit-vsfc
        metrics:
          - type: accuracy
            value: 89.1346
          - type: f1
            value: 89.0436
---

# VSFC Topic Classifier using PhoBERT

This model is fine-tuned from [`vinai/phobert-base`](https://huggingface.co/vinai/phobert-base) on UIT-VSFC (the Vietnamese Students' Feedback Corpus) to classify the topic of student feedback sentences.

## 🧠 Model Details

- **Model type**: Transformer (PhoBERT, a RoBERTa-based model)
- **Base model**: [`vinai/phobert-base`](https://huggingface.co/vinai/phobert-base)
- **Fine-tuned task**: Sentence-level topic classification
- **Target labels**: Lecturer, Training program, Facility, Others
- **Tokenizer**: BPE (PhoBERT tokenizer)
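
The classifier produces one logit per target label. As a rough sketch of how those logits could be turned into a label name and a confidence score, the snippet below applies a softmax in plain Python. The index order in `ID2LABEL` and the helper `decode_prediction` are assumptions made for illustration only; the actual mapping should be read from `model.config.id2label` if the checkpoint defines it.

```python
import math

# Hypothetical index order, for illustration only; the real mapping
# should be read from `model.config.id2label`.
ID2LABEL = {0: "Lecturer", 1: "Training program", 2: "Facility", 3: "Others"}


def decode_prediction(logits, id2label=ID2LABEL):
    """Return (label, probability) for the highest-scoring class."""
    # Numerically stable softmax: subtract the max logit before exponentiating
    max_logit = max(logits)
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Index of the most probable class
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Made-up logits, not real model output
label, prob = decode_prediction([2.0, 0.5, -1.0, 0.1])
print(label, round(prob, 3))
```

With a real model, the list passed in would be something like `outputs.logits[0].tolist()`.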

## 📚 Training Data

- **Dataset**: [UIT-VSFC](https://drive.google.com/drive/folders/1xclbjHHK58zk2X6iqbvMPS2rcy9y9E0X)
- **Language**: Vietnamese
- **License**: Academic use
- **Description**: Student feedback is a valuable resource for interdisciplinary research that combines sentiment analysis and education.

## 🚀 How to Use

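```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "tmt3103/VSFC-topic-classify-phoBERT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Example feedback: "The lecturer is friendly and lovely".
# PhoBERT was pre-trained on word-segmented text, so segmenting the input
# first (e.g. "Giảng_viên") may help, depending on how this model was fine-tuned.
inputs = tokenizer("Giảng viên thân thiện dễ thương", return_tensors="pt")

with torch.no_grad():  # gradients are not needed for inference
    outputs = model(**inputs)
predicted_class = outputs.logits.argmax(dim=-1).item()  # index of the top class
```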