Author(s)

Ghoghari Dhwani, Goswami Yashvi, Prof. Shwetaba B. Chauhan, Prof. Dhaval R. Chandarana

  • Manuscript ID: 140086
  • Volume: 2
  • Issue: 1
  • Pages: 198–207

Subject Area: Computer Science

Abstract

WhatsApp has become a common medium for educational communication, yet it lacks built-in content moderation tools [1]. Unmoderated group chats risk exposing students to harassment, misinformation, and other inappropriate content [2]. This paper presents a Natural Language Processing (NLP)-based moderation model for classroom WhatsApp groups. The model monitors messages in real time and classifies them to detect policy violations such as offensive language, bullying, or sensitive disclosures. Offending messages are automatically removed or flagged, corrective feedback is provided to students, and severe cases are escalated to educators. We combine a simple keyword filter with a statistical classifier to balance interpretability and accuracy [3]. In evaluation, the system achieved high accuracy in identifying harmful messages while minimizing false alarms. A pilot deployment in a student group showed that fewer than 0.1% of messages required intervention, and all flagged incidents but one false positive were indeed problematic [4]. We discuss ethical considerations including privacy, bias, and the importance of human oversight. Our findings suggest that an NLP-driven moderation tool can help maintain a safe and positive online classroom environment.
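The two-stage design summarized in the abstract, a keyword filter backed by a statistical classifier, can be sketched in a few lines. The blocklist, the toy training messages, and the tiny Naive Bayes implementation below are illustrative assumptions, not the authors' actual lexicon, data, or model; a real deployment would use a curated lexicon and a classifier trained on a labeled corpus.

```python
import math
from collections import Counter

# Hypothetical blocklist for the interpretable first stage (illustrative only).
BLOCKLIST = {"idiot", "stupid", "hate"}

def keyword_flag(message):
    """Stage 1: flag a message if any token matches the blocklist."""
    return any(tok in BLOCKLIST for tok in message.lower().split())

class NaiveBayes:
    """Stage 2: minimal bag-of-words Naive Bayes with Laplace smoothing."""
    def fit(self, texts, labels):
        self.classes = set(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        self.class_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.classes for w in self.word_counts[c]}
        return self

    def predict(self, text):
        tokens = text.lower().split()
        total = sum(self.class_counts.values())
        best, best_lp = None, float("-inf")
        for c in self.classes:
            # Log prior plus smoothed log likelihood of each token.
            lp = math.log(self.class_counts[c] / total)
            denom = sum(self.word_counts[c].values()) + len(self.vocab)
            for tok in tokens:
                lp += math.log((self.word_counts[c][tok] + 1) / denom)
            if lp > best_lp:
                best, best_lp = c, lp
        return best

def moderate(message, clf):
    """Combine both stages: a keyword hit or a 'harmful' verdict flags the message."""
    if keyword_flag(message):
        return "flag"
    return "flag" if clf.predict(message) == "harmful" else "allow"

# Toy training data standing in for a real labeled corpus.
texts = ["you are worthless", "nobody likes you",
         "great work on the assignment", "see you in class tomorrow"]
labels = ["harmful", "harmful", "safe", "safe"]
clf = NaiveBayes().fit(texts, labels)
```

The keyword stage keeps common violations transparent and auditable, while the classifier catches harmful phrasing that a fixed list misses, which is the interpretability/accuracy balance the paper describes.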

Keywords