Fine-Grained Sentiment Analysis In Afan Oromo Using Machine Learning Algorithms

Merga, Kumela

dc.contributor.author	Merga, Kumela
dc.date.accessioned	2023-10-31T07:17:38Z
dc.date.available	2023-10-31T07:17:38Z
dc.date.issued	2023-06
dc.identifier.uri	http://hdl.handle.net/123456789/3165
dc.description.abstract	Sentiment analysis is the problem of determining the polarity of a text in a specific domain, and it can be conducted at a coarse-grained or fine-grained level. Analysis at the document, sentence, or review level is categorized as coarse-grained, whereas analysis at the clause, sub-sentence, phrase, or word level is categorized as fine-grained and enables in-depth investigation of a particular emotion or mood of users. Because a single sentence can contain several terms of differing polarity, the polarity of a text often cannot be determined simply at the sentence level. It is therefore crucial to extract negative, positive, and neutral opinions from Afan Oromo sentence structure at a more granular level, since fine-grained sentiment analysis captures people's sentiment more accurately when the data are carefully annotated. This is challenging for sentiment analysis algorithms that operate on Afan Oromo phrase and clause components, because shorter text fragments seldom convey enough information to detect polarity out of context. We focused on the political domain and collected about 3,168 polar clauses labeled into three classes. In this study, we employed supervised machine learning classifiers. Naïve Bayes: we trained the Naïve Bayes variants MNB, BNB, CNB, and GNB with a TF-IDF feature extractor, which yielded 89.43%, 89.12%, 90.54%, and 90.54%, respectively. MNB trained on hybrid n-gram features reached an accuracy of 94.95% with unigrams plus bigrams and 95.43% with unigrams plus trigrams. Support Vector Machine: SVM achieved 0.9479 (≈ 95%) with TF-IDF, and SVC tuned with a grid search estimator achieved 0.95268 (≈ 95%). To improve the result further, we employed LSVC and SGDC, which reached 98.11% and 98.42% on TF-IDF features and 90.54% and 88.49% on count-vectorized data, respectively. Bi-directional Long Short-Term Memory (BiLSTM) and Long Short-Term Memory (LSTM) achieved accuracies of 88.33% and 90.11%, respectively.
Based on the experimental results, SVM and NB perform better and suit this study's relatively small dataset, while both BiLSTM and LSTM perform worse and need more data. Challenges arose from the lack of tools for Afan Oromo spelling and grammar checking, stemming, normalization, and conjunction-based clause building and splitting, since the language is under-resourced and needs more attention to achieve the best results. Dataset size is one limitation of our study with respect to training deep learning models and obtaining good analysis results from them.	en_US
dc.language.iso	en	en_US
dc.publisher	Ambo University	en_US
dc.subject	Fine-grained	en_US
dc.subject	Sentiment Analysis	en_US
dc.subject	Afan Oromo	en_US
dc.title	Fine-Grained Sentiment Analysis In Afan Oromo Using Machine Learning Algorithms	en_US
dc.type	Thesis	en_US
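
The abstract describes training a multinomial Naïve Bayes (MNB) classifier on combined unigram and bigram features over three polarity classes. As a rough illustration of that idea only, the following is a minimal standard-library sketch, not the thesis's code: it uses raw n-gram counts rather than TF-IDF weights, the training clauses are invented English placeholders rather than Afan Oromo data, and the names `ngrams`, `MultinomialNB`, `texts`, `labels`, and `model` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, n_values=(1, 2)):
    """Return unigram and bigram features for a whitespace-tokenized clause."""
    tokens = text.lower().split()
    feats = []
    for n in n_values:
        for i in range(len(tokens) - n + 1):
            feats.append(" ".join(tokens[i:i + n]))
    return feats


class MultinomialNB:
    """Multinomial Naive Bayes over n-gram counts with add-alpha smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)          # clauses per class
        self.feature_counts = defaultdict(Counter)   # n-gram counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = ngrams(text)
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + self.alpha * vocab_size
            score = math.log(n_docs / total)  # log prior
            for feat in ngrams(text):
                if feat in self.vocab:  # n-grams unseen in training are skipped
                    score += math.log((counts[feat] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical placeholder clauses (English stand-ins, not the thesis dataset).
texts = [
    "the policy is good", "a very good decision",
    "the policy is bad", "a very bad decision",
    "the meeting is today", "the vote is on monday",
]
labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

model = MultinomialNB().fit(texts, labels)
print(model.predict("good decision"))
print(model.predict("bad policy"))
```

Adding bigrams lets the classifier use short word sequences in addition to single words, which is one plausible reason the hybrid n-gram features in the abstract score higher than unigram TF-IDF alone. A library implementation with TF-IDF weighting would normally be used in practice.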