Enhanced Relative Discrimination Criterion Approach for Feature Ranking in Text Classification
Abstract
The volume of textual information is growing at an unprecedented pace across many fields, which makes extracting meaningful information and insight a major challenge. Effective feature selection is key to using this data and is especially important for improving the performance of text classification. Classical feature selection methods, developed for numerical or categorical data, are limited when applied to text because of its high dimensionality, sparsity, and semantic complexity. This paper proposes the Modified Relative Discrimination Criterion (MRDC), an improved feature ranking approach designed for text classification. MRDC aims to address the shortcomings of traditional approaches by exploiting the discriminative power of features in a text corpus. The proposed approach captures the relative importance of each feature and produces a robust ranking that improves the selection process. Its performance was evaluated using accuracy, precision, recall, F1-score, and a classification report. The results show that MRDC achieved 82.13% accuracy, 82.50% precision, 82.20% recall, and 82.22% F1-score with 1500 selected features.
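As an illustration of the evaluation metrics named above, the following minimal sketch computes accuracy, precision, recall, and F1-score from true and predicted labels. It is not the authors' evaluation code. The abstract does not state how per-class scores were averaged, so macro-averaging is an assumption here, and the function name `evaluate` and the toy labels are hypothetical.

```python
from collections import Counter


def evaluate(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro-averaging is an assumption; the abstract does not say which
    averaging scheme the reported figures use.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # class p was predicted wrongly
            fn[t] += 1          # class t was missed

    def ratio(num, den):
        # Treat an undefined ratio (zero denominator) as 0.0.
        return num / den if den else 0.0

    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = ratio(tp[c], tp[c] + fp[c])
        rec = ratio(tp[c], tp[c] + fn[c])
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(ratio(2 * prec * rec, prec + rec))

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy usage with hypothetical labels.
scores = evaluate(["a", "a", "b", "b"], ["a", "b", "b", "b"])
```

A per-class breakdown of the same three quantities is what a classification report typically tabulates; the dictionary above only keeps the averages.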
Article Details
The author transfers all copyright ownership of the manuscript entitled (title of article) to the Technical Journal in the event the work is published.