Design and Development of Part of Speech Tagger for Ge'ez Language Using Hybrid Approach

Gebremeskel Hagos Gerbremedhin

Abstract

Part-of-speech (POS) tagging is the process of assigning a part of speech or other lexical class marker to each word in a sentence or text. It is a first step toward understanding a natural language, and most other tasks and applications depend heavily on it. To the best of the researcher's knowledge, no POS tagger has yet been developed for Ge'ez. This work therefore proposes a hybrid approach to Ge'ez POS tagging: a Trigrams'n'Tags (TnT) tagger combined with a tagger based on human-written rules, regular expressions, and morphological pattern analysis.

The literature on Ge'ez syntax, morphology, and grammar was reviewed to understand the nature of the language and to identify possible tags. Because no ready-made standard corpus exists for Ge'ez, 26 broad tags were identified and 15,154 words from about 1,305 sentences were collected from a single genre, the Holy Bible. These words were then manually tagged by Ge'ez language professionals for training and testing. Several techniques have been suggested for tagging words automatically with their POS tags. Among these, a hybrid of TnT with human-written rules, regular expressions, and morphological pattern analysis of Ge'ez was expected to perform better than the TnT tagger alone. Experiments were conducted with three taggers: the TnT tagger, the TnT with Regex tagger, and the hybrid tagger. They achieved performances of 77.87%, 82.23%, and 94.32%, respectively. The hybrid tagger therefore outperforms both the TnT tagger and the TnT with Regex tagger, improving on TnT alone by 16.45 percentage points.
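The idea of combining a trained tagger with pattern-based fallbacks can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the article's implementation: a most-frequent-tag lexicon stands in for the TnT tagger, and the regular expressions, tag names, and tokens are hypothetical placeholders rather than the 26 tags or the Ge'ez rules used in the study.

```python
import re
from collections import Counter, defaultdict

# Hypothetical patterns and tags, for illustration only; they are not the
# rules or the tag set described in the article.
REGEX_RULES = [
    (re.compile(r"^\d+$"), "NUM"),   # tokens made only of digits
    (re.compile(r".+at$"), "N_PL"),  # placeholder suffix pattern
]
DEFAULT_TAG = "N"  # last-resort tag when nothing else applies


def train_lexicon(tagged_sentences):
    """Map each word to its most frequent tag in the training data.

    This is a simplified stand-in for a statistical tagger such as TnT,
    which would also use the two preceding tags as context.
    """
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def hybrid_tag(tokens, lexicon):
    """Tag known words from the lexicon; fall back to regex rules, then a default."""
    tagged = []
    for token in tokens:
        if token in lexicon:
            tagged.append((token, lexicon[token]))
            continue
        for pattern, tag in REGEX_RULES:
            if pattern.match(token):
                tagged.append((token, tag))
                break
        else:
            # No rule matched the unknown word.
            tagged.append((token, DEFAULT_TAG))
    return tagged


def accuracy(predicted_tags, gold_tags):
    """Fraction of tokens whose predicted tag equals the gold tag."""
    correct = sum(p == g for p, g in zip(predicted_tags, gold_tags))
    return correct / len(gold_tags)


# Toy training data with placeholder tokens (not real Ge'ez words).
train = [
    [("alpha", "CONJ"), ("beta", "N")],
    [("alpha", "CONJ"), ("gamma", "V")],
]
lexicon = train_lexicon(train)
result = hybrid_tag(["alpha", "12", "deltat", "omega"], lexicon)
```

Here the known word is tagged from the lexicon, while the three unseen tokens are handled by the digit rule, the suffix rule, and the default tag in turn. The fallback chain is what lets pattern knowledge cover words the trained component has never seen, which matters when the training corpus is small.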

How to Cite
Gerbremedhin, G. H. (2019). Design and Development of Part of Speech Tagger for Ge’ez Language Using Hybrid Approach. The International Journal of Science & Technoledge, 7(12). https://doi.org/10.24940/theijst/2019/v7/i12/ST1912-009