Toxicity Classifier

Overview

This model is a fine-tuned BERT model designed for detecting toxicity in text. It classifies input text as either "toxic" or "non-toxic" based on learned patterns from a diverse dataset of online comments and discussions. The model achieves high accuracy in identifying harmful language, making it suitable for content moderation tasks.

Model Architecture

The model is based on the BERT (Bidirectional Encoder Representations from Transformers) architecture, specifically bert-base-uncased. It consists of 12 transformer layers, each with 12 attention heads and a hidden size of 768. The final layer is a classification head that outputs probabilities for the two classes: non-toxic (0) and toxic (1).
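
A minimal sketch of how this architecture can be instantiated with the Transformers library, assuming the standard sequence-classification head with two labels (the label mapping below mirrors the classes described above and is illustrative, not pulled from the released config):

from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the bert-base-uncased backbone with a 2-class classification head.
# id2label/label2id document the class mapping: non-toxic (0), toxic (1).
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=2,
    id2label={0: "non-toxic", 1: "toxic"},
    label2id={"non-toxic": 0, "toxic": 1},
)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")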

Intended Use

This model is intended for use in applications requiring automated toxicity detection, such as:

  • Social media platforms for moderating user comments.
  • Online forums to flag potentially harmful content.
  • Customer support systems to identify abusive language in queries.

It can be integrated into pipelines using the Hugging Face Transformers library. Example usage:

from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub
# ("your-username/toxicity-classifier" is a placeholder repository ID).
classifier = pipeline("text-classification", model="your-username/toxicity-classifier")

# Classify a single string; the pipeline returns the predicted label and its score.
result = classifier("This is a harmful comment.")
print(result)
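
The pipeline returns a list of dictionaries of the form [{"label": ..., "score": ...}], where the label names are taken from the id2label mapping in the model's configuration and the score is the softmax probability of the predicted class.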