An improved self-training model to detect fake news categories using multi-class classification of unlabeled data

Fake news classification with unlabeled data


  • Oumaima Stitini
  • Soulaimane Kaloun
  • Omar Bencharef
  • Sara Qassimi



Multi-class classification, Unlabeled data, Semi-supervised learning, Self-training, Recommender system, Fake news, Imbalanced learning


Classifying news content has recently attracted significant attention in both academic and industrial settings. Social media platforms and blogs have become everyday sources of information because they are cheap and easy to access. This widespread use of social media for news consumption has also accelerated the dissemination of fake news, which spreads through online communities via shares, re-shares, and re-posts and adversely affects both individuals and society. Identifying and countering misinformation has therefore become a critical task. Fake news detection is an emerging research area that has drawn considerable interest, yet it remains challenging, mainly because available resources are limited. Some studies have focused on distinguishing fake from real news using labeled data and have achieved some success. In this paper, we instead identify and classify different forms of fake news, exploring how unlabeled data can be used for multi-class classification. The proposed approach, based on self-training, assigns fake news to one of four categories: satire (fake satirical information), manufacturing, manipulation, and propaganda. The experimental evaluation demonstrates the efficiency of the proposed system.
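
To make the idea of multi-class classification with unlabeled data concrete, the following is a minimal sketch of a generic self-training loop. It is not the improved model proposed in this paper. The nearest-centroid classifier, the two-dimensional toy feature vectors, the confidence threshold of 0.6, and all function names are assumptions chosen only for illustration; only the four category names come from the text above. In practice, each news item would first be converted into a numeric feature vector.

```python
import math

# The four fake news categories named in the abstract.
LABELS = ["satire", "manufacturing", "manipulation", "propaganda"]


def centroids(points, labels):
    """Return the mean feature vector of each class."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict_proba(cents, x):
    """Softmax over negative Euclidean distances to the class centroids."""
    scores = {y: math.exp(-math.dist(c, x)) for y, c in cents.items()}
    total = sum(scores.values())
    return {y: s / total for y, s in scores.items()}


def classify(cents, x):
    """Return the most probable class for a feature vector."""
    proba = predict_proba(cents, x)
    return max(proba, key=proba.get)


def self_train(labeled_x, labeled_y, unlabeled_x, threshold=0.6, max_rounds=10):
    """Generic self-training: repeatedly pseudo-label confident samples."""
    x, y = list(labeled_x), list(labeled_y)
    pool = list(unlabeled_x)
    for _ in range(max_rounds):
        cents = centroids(x, y)
        remaining, added = [], 0
        for sample in pool:
            proba = predict_proba(cents, sample)
            label = max(proba, key=proba.get)
            if proba[label] >= threshold:
                # Confident prediction: treat it as a label and keep it.
                x.append(sample)
                y.append(label)
                added += 1
            else:
                remaining.append(sample)
        pool = remaining
        if added == 0 or not pool:
            break  # nothing new was learned, or nothing is left to label
    return centroids(x, y), pool


# Hypothetical toy data: one labeled example per category, plus unlabeled points.
labeled_x = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
labeled_y = list(LABELS)
unlabeled_x = [(0.5, 0.4), (9.6, 0.3), (0.2, 9.5), (9.7, 9.8), (5.0, 5.0)]

model, leftover = self_train(labeled_x, labeled_y, unlabeled_x)
print(classify(model, (9.0, 1.0)))
print(leftover)
```

In this sketch, the point equidistant from all four classes never reaches the confidence threshold, so it stays unlabeled instead of being forced into a category. The threshold therefore trades the amount of pseudo-labeled data against the risk of reinforcing early mistakes.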