Welcome to the Ambo University Institutional Repository

Paraphrase Detection in Afan Oromo Texts Using Deep Learning Techniques and its Application in automatic Plagiarism Detection

dc.contributor.author Wakjira, Bekele
dc.date.accessioned 2023-11-02T06:46:33Z
dc.date.available 2023-11-02T06:46:33Z
dc.date.issued 2021-12
dc.identifier.uri http://hdl.handle.net/123456789/3182
dc.description.abstract This study reports an investigation and experiments on developing paraphrase detection for Afan Oromo texts and applying it to automatic plagiarism detection. Paraphrasing means restating a sentence in another form, for example by replacing a keyword with a synonym, adding a phrase to a word, or adding more detail about a particular word; it conveys the same message without compromising the meaning. However, with the rapid growth of digital media and paraphrasing tools, paraphrasing also creates more opportunities for paraphrase plagiarism, which is difficult to detect. Paraphrase plagiarism is a persistent problem for plagiarism detection systems: most of them (many of which are commercial) are designed to detect word co-occurrences and light modifications, and they cannot detect heavily modified text involving semantic, structural, or paraphrase changes. Paraphrase detection is a natural language processing task that involves determining the degree to which two text segments are related, and it plays an important role in detecting paraphrase plagiarism. It also has many applications in natural language processing and understanding, such as machine translation, information retrieval, and question answering. Many studies have reported and implemented paraphrase detection for resource-rich languages such as English, Chinese, German, and French. However, to the best of the researcher's knowledge, no formal study has been reported for resource-scarce Ethiopian languages such as Afan Oromo, Amharic, Somali, and Sidama. Therefore, this study aimed to design and develop an automatic paraphrase detection model for Afan Oromo texts using deep learning techniques. To this end, a dataset was gathered and prepared from Afan Oromo documents publicly available at the Addis Ababa University Institutional Repository.
First, we performed text preprocessing and data annotation in cooperation with domain experts. We used 80% of the data to train the deep learning models and the remaining 20% to test their performance. The convolutional neural network model with fastText word embeddings achieved an accuracy of 67%, which is a promising result for automatic paraphrase detection in Afan Oromo texts. en_US
dc.language.iso en en_US
dc.publisher Ambo University en_US
dc.subject Afan Oromo en_US
dc.subject Deep Learning en_US
dc.subject Paraphrase en_US
dc.title Paraphrase Detection in Afan Oromo Texts Using Deep Learning Techniques and its Application in automatic Plagiarism Detection en_US
dc.type Thesis en_US