# Toxicity Classifier

## Overview
This model is a fine-tuned BERT model designed for detecting toxicity in text. It classifies input text as either "toxic" or "non-toxic" based on learned patterns from a diverse dataset of online comments and discussions. The model achieves high accuracy in identifying harmful language, making it suitable for content moderation tasks.
## Model Architecture
The model is based on the BERT (Bidirectional Encoder Representations from Transformers) architecture, specifically bert-base-uncased. It consists of 12 transformer layers, each with 12 attention heads and a hidden size of 768. The final layer is a classification head that outputs probabilities for the two classes: non-toxic (0) and toxic (1).
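As a minimal sketch, the architecture described above can be instantiated with the Transformers library as shown below. The label mapping is an assumption based on the class order stated in this card, not an exported artifact of the model.

```python
from transformers import AutoConfig, AutoModelForSequenceClassification, AutoTokenizer

# bert-base-uncased: 12 transformer layers, 12 attention heads, hidden size 768.
# num_labels / id2label are assumptions taken from the class description above.
config = AutoConfig.from_pretrained(
    "bert-base-uncased",
    num_labels=2,
    id2label={0: "non-toxic", 1: "toxic"},
    label2id={"non-toxic": 0, "toxic": 1},
)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", config=config)

# Confirm the base architecture dimensions.
print(model.config.num_hidden_layers, model.config.num_attention_heads, model.config.hidden_size)
# -> 12 12 768
```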
## Intended Use
This model is intended for use in applications requiring automated toxicity detection, such as:
- Social media platforms for moderating user comments.
- Online forums to flag potentially harmful content.
- Customer support systems to identify abusive language in queries.
It can be integrated into pipelines using the Hugging Face Transformers library. Example usage:

```python
from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub.
classifier = pipeline("text-classification", model="your-username/toxicity-classifier")

# Classify a single comment; returns a label and confidence score.
result = classifier("This is a harmful comment.")
print(result)
```
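For moderation use cases, predictions are typically gated on a confidence threshold rather than the raw label. The sketch below assumes the pipeline returns entries of the form `{"label": ..., "score": ...}`; the label string "toxic" and the 0.8 threshold are illustrative assumptions, not part of this model card.

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="your-username/toxicity-classifier")

def should_flag(comment: str, threshold: float = 0.8) -> bool:
    # Label name "toxic" and the threshold value are assumptions for illustration.
    prediction = classifier(comment)[0]  # e.g. {"label": "toxic", "score": 0.97}
    return prediction["label"] == "toxic" and prediction["score"] >= threshold

for comment in ["Thanks for the help!", "This is a harmful comment."]:
    print(comment, "->", "flag" if should_flag(comment) else "allow")
```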