Gender recognition by voice is one of the most demanding tasks in speech analysis. With the increasing use of digital communication channels, many speech analysis techniques are being used to identify gender from the acoustic features of a speaker’s voice. This paper presents an algorithm, implemented as a tool in Praat script, that classifies a speaker’s gender by analysing sound features such as pitch, formants, and Mel-frequency cepstral coefficients (MFCCs), using speech processing techniques broadly categorized into composite and multi-layer feature approaches. Euclidean distance and Naïve Bayes are used to compare a cumulative feature vector, containing the fundamental frequency (F0), formant frequencies, and MFCCs, with a base vector of the same features obtained through supervised training on the Texas Instruments and Massachusetts Institute of Technology (TIMIT) speech corpus. The results of these techniques are then aggregated and refined with a fuzzy logic rule base to obtain a more accurate outcome. The algorithm is also designed to be efficient in terms of processing time, accuracy, and reliability by eliminating frames with undefined F0 and removing outliers while identifying sound features. The multi-layer feature approach achieves 98% accuracy in gender recognition, compared with just 77% for the composite approach, on a sample dataset of 133 Urdu speakers’ voices obtained from Pakistani Urdu dramas.
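The two steps named in the abstract, discarding frames with undefined F0 and comparing a feature vector against per-class base vectors by Euclidean distance, can be illustrated with a minimal sketch. It is written in Python rather than Praat script and is not the authors' implementation. The function names, the median-based outlier rule, and the base-vector values are all hypothetical and chosen only for illustration. A real system would also normalize features so that larger-valued ones do not dominate the distance.

```python
import math
from statistics import mean, median


def clean_f0(frames):
    """Drop undefined F0 frames (None, NaN, or <= 0 Hz), then discard outliers."""
    voiced = [f for f in frames if f is not None and not math.isnan(f) and f > 0]
    if not voiced:
        raise ValueError("no voiced frames to analyse")
    med = median(voiced)
    # Assumed outlier rule (not from the paper): keep frames within 50% of the median.
    return [f for f in voiced if abs(f - med) <= 0.5 * med]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(features, base_vectors):
    """Return the label whose base vector lies closest to the feature vector."""
    return min(base_vectors, key=lambda label: euclidean(features, base_vectors[label]))


# Hypothetical base vectors (mean F0, F1, F2 in Hz); illustrative values only.
BASE = {
    "male": [120.0, 500.0, 1500.0],
    "female": [210.0, 560.0, 1700.0],
}

# Per-frame F0 track with unvoiced frames (0.0, NaN) and one spurious value (400 Hz).
f0_track = [0.0, 118.0, float("nan"), 122.0, 400.0, 120.0]
f0_clean = clean_f0(f0_track)
label = classify([mean(f0_clean), 510.0, 1480.0], BASE)
```

Here `f0_clean` keeps only the three plausible voiced frames, so the mean F0 used for classification is not skewed by unvoiced or spurious frames.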
Copyright (c) 2019 Technical Journal
The author transfers all copyright ownership of the manuscript entitled (title of article) to the Technical Journal in the event the work is published.