| description abstract | This paper advances the study of sentiment analysis for the Central Kurdish
language by integrating Bidirectional Encoder Representations from Transformers
(BERT) into Natural Language Processing techniques. Kurdish is a low-resource
language with a high level of linguistic diversity and minimal computational
resources, which makes sentiment analysis challenging. Earlier work relied on
traditional word embedding models such as Word2Vec, but the emergence of
newer language models, specifically BERT, offers hope for improvement.
This study leverages BERT's stronger word embedding capabilities to capture
the nuanced semantics and contextual intricacies of the Kurdish language,
thus setting a new benchmark for sentiment analysis in low-resource languages.
The steps include collecting and normalizing a large corpus of Kurdish texts,
pretraining BERT with a special tokenizer for Kurdish, and developing different
models for sentiment analysis, including Bidirectional Long Short-Term Memory
(BiLSTM), Multi-Layer Perceptron (MLP),
and a fine-tuned BERT classifier. The proposed approach performs 3-class
sentiment analysis (positive, negative, and neutral) using a sentiment embedding of
BERT in four different configurations. The best-performing BiLSTM classifier
reaches an accuracy of 74.09%. For BERT with an MLP classifier, the maximum
accuracy achieved is 73.96%, while the fine-tuned BERT model tops the others
with 75.37% accuracy. Additionally, the fine-tuned BERT model demonstrates
a substantial improvement on 2-class sentiment analysis (positive and
negative), with an accuracy of 86.31%. The study presents a comprehensive
comparison, highlighting BERT’s superiority over traditional models in accuracy
and semantic understanding. The results show that
the proposed BERT-based models outperform the conventionally used Word2Vec models
by a notable accuracy gain in most sentiment analysis tasks. | en_US |