Welcome to Ambo University Institutional Repository!

Sentence Based Paraphrase Detection Model Using Deep Learning Approach In Case Of Amharic Language.

dc.contributor.author Eyob, Kefelegn
dc.date.accessioned 2023-10-27T11:24:56Z
dc.date.available 2023-10-27T11:24:56Z
dc.date.issued 2022-11
dc.identifier.uri http://hdl.handle.net/123456789/3147
dc.description.abstract Paraphrase identification (PI) is the task of automatically determining whether two sentences are similar enough in meaning to be classified as paraphrases. It is usually treated as a binary classification problem, although the criteria for semantic equivalence (that is, the same or almost the same meaning) are difficult to define precisely and can vary from task to task. A paraphrase is an alternative expression with the same (or similar) meaning. For example, "መርሳት" is a paraphrased form of "ማስታወስ አለመቻል". The identification of paraphrases and the degree of their semantic similarity have proven useful in many NLP applications (Erfaneh Gharavi, Kayvan Bijari and Kiarash Zahirnia, 2017). For example, it can be used as a feature to enhance other NLP tasks such as information retrieval, machine translation scoring, text summarization, and question answering. Although many paraphrase identification systems have been developed for various natural languages, no research has yet been conducted for the Amharic language. The proposed model considers different word embedding methods, such as word2vec and fastText, and three deep learning models (BiLSTM-GRN, Siamese Network, and Feature Fusion Network (FFN)) to detect paraphrased sentences automatically and to compare the accuracy of all models. The proposed model will help people detect paraphrased sentences accurately and quickly, in order to avoid duplicate sentences that carry the same meaning and to detect plagiarism. Since there is no publicly available Amharic paraphrase dataset, the data used for this purpose was gathered from the publicly available online Addis Ababa University Institutional Repository, which contains a collection of Amharic-language Master of Arts students' theses. The dataset, which consists of pairs of sentences, was then prepared and annotated with a linguistic expert of the domain. 80% of the data is used to train and develop the deep learning models, and the remaining 20% is used to test their performance. Accordingly, the Siamese neural network model with fastText word embeddings scored an accuracy of 0.9583, a more promising performance for automatic paraphrase detection for the Amharic language than that of the BiLSTM-GRN and FFN models. en_US
dc.language.iso en en_US
dc.publisher Ambo University en_US
dc.subject Amharic en_US
dc.subject Gated Relevance Network (GRN) en_US
dc.subject Deep Learning (DL) en_US
dc.title Sentence Based Paraphrase Detection Model Using Deep Learning Approach In Case Of Amharic Language. en_US
dc.type Thesis en_US
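
The evaluation setup described in the abstract (one shared encoder applied to both sentences of a pair, an 80/20 train/test split, and accuracy as the metric) can be sketched roughly as follows. This is not the thesis's Siamese network: as a simplifying assumption it swaps the trained layers for an average of word vectors and a fixed cosine-similarity threshold. The function names, the two-dimensional toy vectors, the placeholder tokens, and the 0.8 threshold are all hypothetical and chosen only for illustration.

```python
import math
import random


def encode(tokens, embeddings, dim):
    """Shared encoder used for BOTH sentences: mean of known word vectors."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return [0.0] * dim  # no known token: fall back to the zero vector
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def cosine(u, v):
    """Cosine similarity; defined as 0.0 when either vector has zero norm."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)


def predict(sent_a, sent_b, embeddings, dim, threshold=0.8):
    """Return 1 (paraphrase) if the two encodings are similar enough, else 0."""
    sim = cosine(encode(sent_a, embeddings, dim), encode(sent_b, embeddings, dim))
    return 1 if sim >= threshold else 0


def split_80_20(items, seed=0):
    """Shuffle deterministically, then keep 80% for training and 20% for testing."""
    items = list(items)
    random.Random(seed).shuffle(items)
    cut = int(len(items) * 0.8)
    return items[:cut], items[cut:]


def accuracy(examples, embeddings, dim, threshold=0.8):
    """Fraction of (sentence_a, sentence_b, label) triples predicted correctly."""
    hits = sum(predict(a, b, embeddings, dim, threshold) == y for a, b, y in examples)
    return hits / len(examples)


# Hypothetical toy data (placeholder tokens, NOT the thesis dataset).
TOY_EMBEDDINGS = {
    "forget": [1.0, 0.0],
    "not_remember": [0.9, 0.1],
    "run": [0.0, 1.0],
    "sprint": [0.1, 0.9],
}
TOY_PAIRS = [
    (["forget"], ["not_remember"], 1),
    (["run"], ["sprint"], 1),
    (["forget"], ["run"], 0),
    (["not_remember"], ["sprint"], 0),
]
```

In a real setting the threshold (or the trained network replacing it) would be fitted on the 80% portion only, and `accuracy` would be reported on the held-out 20%.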

