Context-Aware Stemming algorithm for semantically related root words

Date

2012

Publisher

Institute of Electrical and Electronics Engineers (IEEE) Inc.

Abstract

There is growing interest in context-awareness as a technique for developing pervasive computing applications that are flexible and adaptable to users. Information retrieval (IR), however, is usually defined in terms of locating documents and delivering them to a user to satisfy an information need. In most cases, morphological variants of a word have similar semantic interpretations and can be treated as equivalent for the purposes of IR applications. Document indexing is therefore more meaningful if semantically related root words are used instead of stems. The popular Porter’s stemmer was studied with the aim of producing intelligible stems. In this paper, we propose the Context-Aware Stemming (CAS) algorithm, a modified version of the widely used Porter’s stemmer. When only meaningful words are counted as valid stemmer output, the results show that the modified algorithm reduces the error rate of Porter’s algorithm from 76.7% to 6.7% without compromising its efficacy.
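The difference between a truncated stem and a meaningful root word can be illustrated with a minimal Python sketch. This is not the CAS algorithm from the paper and not Porter's actual rule set: the suffix list, the tiny lexicon, and the function names `naive_stem`, `lexicon_aware_stem`, and `error_rate` are all hypothetical, chosen only to show how an "error rate" can be measured as the share of outputs that are not real words.

```python
# Hypothetical mini-lexicon; a real system would consult a full dictionary.
LEXICON = {"connect", "relate", "happy", "compute", "agree"}

# Simplified suffix list for illustration only (NOT Porter's rules).
SUFFIXES = ["ness", "ing", "ed", "s"]

WORDS = ["connected", "relating", "happiness", "computed", "agreed"]


def naive_stem(word):
    """Strip the first matching suffix without checking that the result is a word."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def lexicon_aware_stem(word):
    """Strip a suffix, then try simple repairs until the result is a lexicon word."""
    stem = naive_stem(word)
    if stem in LEXICON:
        return stem
    # Repair 1: restore a dropped final "e" ("relat" -> "relate").
    if stem + "e" in LEXICON:
        return stem + "e"
    # Repair 2: map a trailing "i" back to "y" ("happi" -> "happy").
    if stem.endswith("i") and stem[:-1] + "y" in LEXICON:
        return stem[:-1] + "y"
    return stem  # fall back to the raw stem when no repair succeeds


def error_rate(stemmer, words):
    """Fraction of stemmer outputs that are not words in the lexicon."""
    misses = [w for w in words if stemmer(w) not in LEXICON]
    return len(misses) / len(words)


if __name__ == "__main__":
    for w in WORDS:
        print(w, naive_stem(w), lexicon_aware_stem(w))
    print("naive error rate:", error_rate(naive_stem, WORDS))
    print("lexicon-aware error rate:", error_rate(lexicon_aware_stem, WORDS))
```

On this toy input the plain suffix stripper yields fragments such as "relat" and "happi", while the lexicon check recovers "relate" and "happy". The figures it prints describe only these five words and say nothing about the paper's reported results.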

Keywords

Context awareness, Information retrieval, Stemming, Precision, Recall

Citation

Agbele, K.K. (2012). Context-Aware Stemming algorithm for semantically related root words. African Journal of Computing & ICT, 5(4), 33-42.