Propaganda Detection with Historical Context

This model is a fine-tuned version of IDA-SERICS/PropagandaDetection, trained specifically to detect historical context manipulation and geopolitical propaganda.

Model Description

  • Base Model: IDA-SERICS/PropagandaDetection (DistilBERT-based)
  • Fine-tuned on: 168 examples of historical context propaganda
  • Validation Accuracy: 100% on the 20% held-out split
  • Performance Improvement: +66.7 percentage points on historical context detection

Key Capabilities

The model excels at detecting:

  • Geopolitical Framing: Biased presentation of conflicts ("Israel's war", "special military operation")
  • Genocide Denial: Language that minimizes or denies documented genocides
  • War Crimes Euphemisms: Sanitized language for documented violations ("collateral damage", "surgical strikes")
  • False Equivalence: Drawing a misleading moral equivalence between materially different actions
  • Victim Blaming: Language that blames victims of historical atrocities
  • Historical Revisionism: Attempts to rewrite established historical facts

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("bytop-ai/propaganda-detection-historical-context")
model = AutoModelForSequenceClassification.from_pretrained("bytop-ai/propaganda-detection-historical-context")

# Create pipeline
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

# Analyze text
result = classifier("Israel's war against Hamas continues")
print(f"Propaganda: {result[0]['label']} ({result[0]['score']:.3f})")

Performance

Model Type     Historical Context Detection    Overall Accuracy
Base Model     16.7%                           54.5%
This Model     83.3%                           90.9%
Improvement    +66.7 points                    +36.4 points
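
These figures can be reproduced with a plain accuracy computation over a labeled evaluation set. This is a sketch: eval_texts and eval_labels are hypothetical placeholders rather than a published split, and the label strings follow the mapping listed under Model Architecture.

# Hypothetical evaluation data: 1 = PROPAGANDA, 0 = NO_PROPAGANDA
eval_texts = [
    "Russia's special military operation to denazify Ukraine",
    "The ICC is investigating potential war crimes",
]
eval_labels = [1, 0]

label_to_id = {"NO_PROPAGANDA": 0, "PROPAGANDA": 1}
predicted = [label_to_id[p["label"]] for p in classifier(eval_texts, truncation=True)]

correct = sum(p == g for p, g in zip(predicted, eval_labels))
print(f"Accuracy: {correct / len(eval_labels):.1%}")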

Training Data

The model was fine-tuned on a carefully curated dataset of 168 examples including:

  • 100 historical context propaganda examples
  • 68 traditional propaganda examples
  • Balanced representation of different propaganda techniques
  • Examples covering Israel-Palestine, Ukraine-Russia, and other geopolitical contexts
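
The dataset itself is not distributed with the model. Assuming the binary labels described under Model Architecture, a record layout along the following lines (hypothetical examples) would be consistent, including the 20% held-out split noted under Training Details:

from datasets import Dataset

# Hypothetical record format: one text per example with a binary label
# (1 = PROPAGANDA, 0 = NO_PROPAGANDA)
train_examples = [
    {"text": "Russia's special military operation to denazify Ukraine", "label": 1},
    {"text": "UN reports document civilian casualties in the conflict", "label": 0},
]

# 80/20 train/validation split, mirroring the stated validation strategy
splits = Dataset.from_list(train_examples).train_test_split(test_size=0.2)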

Ethical Considerations

This model is designed for educational purposes to help users:

  • Recognize propaganda techniques in media
  • Understand how language can obscure accountability
  • Develop critical thinking about historical framing
  • Identify bias in geopolitical reporting

The model's classifications are based on documented facts and international law, not political positions.

Limitations

  • Trained primarily on English text
  • Focused on contemporary geopolitical contexts
  • May not generalize to all historical periods
  • Requires careful interpretation in sensitive contexts

Citation

@misc{bytop-propaganda-historical-context-2024,
  title={Fine-tuned Propaganda Detection with Historical Context},
  author={BytoP.ai},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/bytop-ai/propaganda-detection-historical-context}
}

Model Architecture

  • Architecture: DistilBERT for Sequence Classification
  • Parameters: ~67M (same as base model)
  • Max Sequence Length: 512 tokens
  • Labels: 2 (NO_PROPAGANDA: 0, PROPAGANDA: 1)
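
The label mapping and sequence limit can be verified directly from the loaded checkpoint:

# Inspect the label mapping and input limit shipped with the model
print(model.config.id2label)       # expected: {0: 'NO_PROPAGANDA', 1: 'PROPAGANDA'}
print(tokenizer.model_max_length)  # expected: 512

# Inputs longer than 512 tokens should be truncated before inference
encoding = tokenizer("a long document ...", truncation=True, max_length=512)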

Training Details

  • Training Epochs: 5
  • Learning Rate: 2e-5
  • Batch Size: 4
  • Validation Strategy: 20% held-out
  • Early Stopping: patience of 3 epochs
  • Final Validation Accuracy: 100%
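
The exact training script is not published. A minimal Trainer setup matching the hyperparameters above might look like this sketch, where tokenized_train and tokenized_val are placeholder tokenized datasets:

from transformers import Trainer, TrainingArguments, EarlyStoppingCallback

# Sketch of a comparable fine-tuning run; dataset variables are placeholders
args = TrainingArguments(
    output_dir="propaganda-detection-historical-context",
    num_train_epochs=5,
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,   # required by EarlyStoppingCallback
    metric_for_best_model="eval_loss",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized_train,
    eval_dataset=tokenized_val,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()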

Example Detections

Propaganda (Detected)

  • "Israel's war against Hamas continues as they defend themselves" β†’ PROPAGANDA (99.7%)
  • "Russia's special military operation to denazify Ukraine" β†’ PROPAGANDA (95.9%)
  • "The alleged genocide is just Hamas propaganda" β†’ PROPAGANDA (99.8%)

Neutral (Not Detected)

  • "UN reports document civilian casualties in the conflict" β†’ NO_PROPAGANDA (99.9%)
  • "International observers documented attacks on hospitals" β†’ NO_PROPAGANDA (99.9%)
  • "The ICC is investigating potential war crimes" β†’ NO_PROPAGANDA (99.9%)

For more information about the full propaganda detection system, visit the GitHub repository.
