Gender Recognition for Urdu Language Speakers Using Composite and Multi-Layer Feature Approaches with Fuzzy Logic


Abdul Ghafar
Nauman Shah
Muhammad Munwar Iqbal

Abstract

Gender recognition by voice is one of the most demanding problems in speech analysis. With the increasing use of digital communication channels, many speech analysis techniques are being used to identify gender from the acoustic features of a speaker’s voice. This paper presents an algorithm, implemented as a Praat script tool, that classifies a speaker’s gender by analysing sound features such as pitch, formants, and MFCCs, using speech processing techniques broadly categorized into composite and multi-layer feature approaches. Euclidean distance and Naïve Bayes are used to compare a cumulative feature vector, containing the fundamental frequency, formant frequencies, and MFCCs, with a base vector of the same features obtained through supervised training on the Texas Instruments and Massachusetts Institute of Technology (TIMIT) speech corpus. The results of these techniques are then aggregated and refined with a fuzzy logic rule base to obtain a more accurate outcome. The algorithm is also designed to be efficient in terms of processing time, accuracy, and reliability by eliminating frames with undefined F0 and removing outliers while identifying sound features. The multi-layer feature approach achieves 98% accuracy in gender recognition, compared to just 77% for the composite approach, on a sample dataset of 133 Urdu speakers’ voices obtained from Pakistani Urdu dramas.
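The two steps named in the abstract, discarding frames with undefined F0 and comparing a feature vector to a base vector by Euclidean distance, can be illustrated with a minimal Python sketch. This is not the authors' Praat script: the function names, the median-based outlier rule, and the base vector values are all assumptions made for illustration, and the Naïve Bayes and fuzzy logic stages are omitted.

```python
import math
import statistics

# Hypothetical base vectors (mean F0, F1, F2 in Hz). These numbers are
# illustrative assumptions, not values from the paper or from TIMIT.
BASE_VECTORS = {
    "male": (120.0, 500.0, 1500.0),
    "female": (210.0, 560.0, 1700.0),
}


def clean_f0(frames, k=3.0):
    """Drop undefined F0 frames, then remove outliers.

    A frame counts as undefined if it is None or not positive.
    Outliers are frames further than k median absolute deviations
    from the median (an assumed rule; the abstract does not specify one).
    """
    voiced = [f for f in frames if f is not None and f > 0]
    if len(voiced) < 3:
        return voiced
    median = statistics.median(voiced)
    mad = statistics.median(abs(f - median) for f in voiced)
    if mad == 0:
        return voiced
    return [f for f in voiced if abs(f - median) <= k * mad]


def classify(feature_vector, base_vectors=BASE_VECTORS):
    """Return the label whose base vector is nearest in Euclidean distance."""
    return min(
        base_vectors,
        key=lambda label: math.dist(feature_vector, base_vectors[label]),
    )


# Usage: per-frame F0 estimates with two undefined frames and one outlier.
f0_frames = [None, 118.0, 122.0, 0.0, 121.0, 119.0, 400.0, 120.0]
mean_f0 = statistics.mean(clean_f0(f0_frames))
label = classify((mean_f0, 510.0, 1520.0))
```

In this sketch the features are compared on their raw Hz scale, so the larger formant values dominate the distance; a practical system would likely normalize each feature before comparison.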

Article Details

Section
COMPUTER SCIENCE
