dc.description.abstract |
Sentiment analysis is the problem of determining the polarity of a text in a specific domain, and
it can be conducted at a coarse-grained or fine-grained level. Analysis at the document, sentence,
or review level is categorized as coarse-grained, whereas analysis at the clause, sub-sentence,
phrase, or word level is categorized as fine-grained and enables in-depth investigation of a
particular emotion or mood of users. Because a single sentence can contain compound polar terms
within a domain, the polarity of a text cannot be determined in a simple way. Therefore, it is
crucial to extract negative, positive, and neutral opinions from Afan Oromo sentence structure at a
more granular level, since fine-grained sentiment analysis accurately captures people's sentiment
when the data are carefully annotated. This is challenging for sentiment analysis algorithms that
focus on Afan Oromo phrase and clause components, because shorter textual fragments seldom convey
adequate information to detect polarity out of context. We focused on the political domain and
collected about 3168 polar clauses labeled into three classes.
In this study, we employed supervised machine learning classifiers. Naïve Bayes: we trained the
Naïve Bayes variants MNB, BNB, CNB, and GNB with a TF-IDF feature extractor, which achieved
accuracies of 89.43%, 89.12%, 90.54%, and 90.54%, respectively. We also trained MNB on hybrid
n-gram features: unigrams combined with bigrams gave an accuracy of 94.95%, and unigrams combined
with trigrams gave 95.43%. Support Vector Machine: SVM achieved 0.9479 (≈ 95%) with TF-IDF, and SVC
with a grid search estimator achieved 0.95268 (≈ 95%). To improve on this result, we employed LSVC
and SGDC, which achieved 98.11% and 98.42% on TF-IDF features and 90.54% and 88.49% on
count-vectorized data, respectively. Bidirectional Long Short-Term Memory (BiLSTM) and Long
Short-Term Memory (LSTM) models achieved accuracies of 88.33% and 90.11%, respectively. Based on
the experimental results, SVM and NB perform better and are well suited to this study given the
size of the dataset, whereas both BiLSTM and LSTM perform less well and need more data. Challenges
arose in this study from the lack of grammar and spelling, stemming, normalization, and
conjunction-based clause building and splitting resources for Afan Oromo, since the language is
under-resourced and needs more attention. The limited dataset size is one limitation of our study,
since deep learning models need more data to train well and provide good analysis results. |
en_US |