dc.description.abstract |
Social media is a form of electronic communication that lets users interact and share content
online. The most popular social media platforms include Facebook, Twitter,
YouTube, and other commonly used social blogging platforms. The number of people using
these platforms is increasing rapidly, and it is important to be aware of what comments and
posts are being shared. Because users share their ideas on these sites without control, the spread
of extremist ideas and offensive language has become a major challenge. Despite being a
source of inciting and inflammatory content, extremists can go undetected for long periods
because of the huge volume of online data and the inefficiency of the manual detection strategies
practiced in many developing countries such as Ethiopia. For resource-rich
Western languages such as English, a number of studies have been conducted on social media
analytics and social network analysis in relation to detecting extremists using various
machine learning and deep learning techniques. However, to the best of our knowledge, no
formal research has been conducted on Afan Oromo social media comments and posts to
automatically detect extremist and conflict-inciting content on platforms such as Facebook
and Twitter. We propose a deep learning-based sentiment analysis solution that detects
extremist texts on social media based on users' comments and posts in Afan Oromo. The
main purpose of this research is to design and develop a model that automatically
detects extremist content in Afan Oromo comments and posts on social media
platforms such as Facebook. In the first step, we collected comments and posts from the public
Facebook pages of BBC, OBN, FBC, OLF, KFO, and politically influential people using the
Facepager tool. In the second step, we preprocessed the text and annotated the data
in consultation with domain experts and Afan Oromo experts. In the
third step, we applied fastText word embeddings for feature representation. In the fourth
step, we used 80% of the dataset to train the deep learning models LSTM, CNN,
LSTM+CNN, and CNN+LSTM to build an extremist detection model, and we compared
their performance in our experiments. Finally, the proposed CNN+LSTM model achieved
91% accuracy with fastText word embeddings, which is a promising result for
detecting extremist content in Afan Oromo comments and posts on social media. |
en_US |