The Problem of Words Undergoing Sound Changes in Uzbek Stemmers

  • Botir Elov Doctor of philosophy of technical sciences (PhD), associate professor Alisher Navo’i Tashkent State University of Uzbek Language and Literature
  • Zilola Xusainova PhD student, Alisher Navo’i Tashkent State University of Uzbek Language and Literature
  • Habiba Berdiyeva Master's degree student, Alisher Navo’i Tashkent State University of Uzbek Language and Literature
Keywords: Natural Language Processing, NLP, root, stem, sound change, POS tagging

Abstract

Stemming is one of the most common preprocessing steps in Natural Language Processing (NLP) and can be applied in almost any NLP project. It removes part of a word, reducing the word to its root or stem. Several stemming algorithms exist, and they differ in how they decide where to cut a word. When determining the stems of Uzbek words, several problems can arise: homonymy between roots and suffixes, sound changes that occur when a suffix is attached to a word, and the stemming of neologisms and named entities (NERs). This article presents models for handling words that undergo sound changes when stemming texts from an Uzbek language corpus.
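The sound-change problem can be illustrated with a minimal Python sketch. It is not the article's model: the lexicon, the suffix list, and the two restoration rules (undoing the k → g and q → g' alternation, and reinserting a dropped vowel) are hypothetical and chosen only for illustration, and an ASCII apostrophe stands in for the Uzbek letter modifier in g'. The sketch strips a single suffix and accepts a candidate stem only if it appears in the lexicon.

```python
# Toy lexicon and suffix list: illustrative assumptions, not data from the article.
LEXICON = {"kitob", "bilak", "qishloq", "burun"}
SUFFIXES = ["ingiz", "imiz", "ning", "lar", "ing", "ni", "ga", "im", "i"]

# Stem-final consonant as it appears before a vowel-initial suffix,
# mapped back to its dictionary form (k -> g, q -> g' alternation).
VOICING = {"g'": "q", "g": "k"}


def candidates(base):
    """Yield possible dictionary forms of a stripped base."""
    yield base
    # Undo consonant voicing at the end of the base.
    for voiced, voiceless in VOICING.items():
        if base.endswith(voiced):
            yield base[: -len(voiced)] + voiceless
    # Undo vowel dropping by reinserting a vowel before the final consonant.
    if len(base) >= 3:
        for vowel in "iu":
            yield base[:-1] + vowel + base[-1]


def stem(word):
    """Strip one suffix and return a lexicon-verified stem, else the word itself."""
    word = word.lower()
    if word in LEXICON:
        return word
    # Try longer suffixes first so that "lar" is preferred over shorter matches.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) > len(suffix):
            base = word[: -len(suffix)]
            for cand in candidates(base):
                if cand in LEXICON:
                    return cand
    return word


for w in ["kitoblar", "bilagim", "qishlog'i", "burni"]:
    print(w, "->", stem(w))
```

Checking each candidate against a lexicon keeps the restoration rules from producing stems that are not real words; a plain suffix stripper would return "bilag" for "bilagim" instead of "bilak".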

Published
2023-06-14
How to Cite
Elov, B., Xusainova, Z., & Berdiyeva, H. (2023). The Problem of Words Undergoing Sound Changes in Uzbek Stemmers. Central Asian Journal of Literature, Philosophy and Culture, 4(6), 107-114. Retrieved from https://cajlpc.centralasianstudies.org/index.php/CAJLPC/article/view/905
Section
Articles