Show simple item record

dc.contributor.author: Muhealddin Awlla, Kozhin
dc.contributor.author: Veisi, Hadi
dc.contributor.author: Abas Abdullah, Abdulhady
dc.date.accessioned: 2025-02-21T19:12:05Z
dc.date.available: 2025-02-21T19:12:05Z
dc.date.issued: 2025
dc.identifier.uri: http://192.64.112.23/xmlui/handle/311/91
dc.description.abstract: This paper advances the study of sentiment analysis for the Central Kurdish language by integrating Bidirectional Encoder Representations from Transformers (BERT) into Natural Language Processing techniques. Kurdish is a low-resourced language with high linguistic diversity and minimal computational resources, which makes sentiment analysis challenging. Earlier work relied on traditional word embedding models such as Word2Vec, but the emergence of newer language models, specifically BERT, promises improvements. BERT's stronger word embedding capabilities aid this study in capturing the nuanced semantics and contextual intricacies of the Kurdish language, setting a new benchmark for sentiment analysis in low-resource languages. The steps include collecting and normalizing a large corpus of Kurdish texts, pretraining BERT with a dedicated Kurdish tokenizer, and developing several sentiment analysis models, including a Bidirectional Long Short-Term Memory (BiLSTM) network, a Multi-Layer Perceptron (MLP), and a fine-tuned BERT classifier. The proposed approach performs three-class sentiment analysis (positive, negative, and neutral) using BERT sentiment embeddings in four different configurations. The best-performing BiLSTM classifier reaches an accuracy of 74.09%; the BERT-with-MLP classifier achieves a maximum accuracy of 73.96%, while the fine-tuned BERT model tops both with 75.37%. Restricted to a two-class task (positive and negative), the fine-tuned BERT model improves markedly, reaching 86.31% accuracy. The study makes a comprehensive comparison, highlighting BERT's superiority over traditional embeddings in both accuracy and semantic understanding. These results show that the proposed BERT-based models outperform the conventionally used Word2Vec models by a notable accuracy gain in most sentiment analysis tasks.
dc.language.iso: en_US
dc.publisher: Language Resources and Evaluation
dc.subject: Sentiment analysis
dc.subject: Deep learning
dc.subject: BERT
dc.subject: BiLSTM
dc.subject: Central Kurdish language
dc.title: Sentiment analysis in low‑resource contexts: BERT’s impact on Central Kurdish
dc.type: Article

