Closed Set Speaker Identification using Mel Frequency Cepstral Coefficients on Vowels Preceding Nasal Continuants in Kannada | Journal of All India Institute of Speech and Hearing

Vol 34 No 1 (2015)
Speech


How to Cite
Shivakumar, A. M., & R, R. (2015). Closed Set Speaker Identification using Mel Frequency Cepstral Coefficients on Vowels Preceding Nasal Continuants in Kannada. Journal of All India Institute of Speech and Hearing, 34(1), 76-84. Retrieved from http://203.129.241.91/jaiish/index.php/aiish/article/view/845

Abstract

The aim of the present study was to obtain the percentage of speaker identification using vowels preceding nasal continuants in Kannada-speaking adults with a semi-automatic method. The participants were twenty Kannada-speaking adult males in the age range of 21-32 years, who constituted Group I; a subset of ten of these speakers constituted Group II. The material consisted of meaningful Kannada words containing the long vowels /a:/, /i:/ and /u:/ preceding the nasal continuants /m/ and /n/, embedded in Kannada sentences. The participants read the material four times each under two conditions: (a) live recording and (b) mobile network recording. The target words were truncated using the PRAAT software, and each vowel preceding a nasal was subjected to extraction of Mel Frequency Cepstral Coefficients (MFCCs) using the Speech Science Lab Workbench semi-automatic speaker recognition software. Speaker identification was compared under three conditions: (a) live vs. live recording, (b) mobile network vs. mobile network recording and (c) live vs. mobile network recording. The same comparisons were repeated when the number of participants was reduced from twenty to ten. The results indicated a high percentage of correct speaker identification using MFCCs in the live vs. live and mobile network vs. mobile network conditions compared to the live vs. mobile network condition. The obtained outcome could serve as a potential measure in forensic scenarios for the identification of speakers using vowels preceding nasal continuants in Kannada.
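The pipeline described above (truncated vowel segments, MFCC extraction, closed-set matching against known speakers) can be sketched in outline. This is a minimal illustration, not the study's Speech Science Lab Workbench procedure: it uses a textbook NumPy-only MFCC computation (framing, Hamming window, power spectrum, mel filterbank, log, DCT-II) and synthetic two-sinusoid "vowels" with hypothetical speaker labels and formant values in place of the real recordings.

```python
import numpy as np

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale, m = 2595*log10(1 + f/700)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def mfcc(signal, sr, n_mfcc=13, frame_len=400, hop=160, n_fft=512, n_filters=26):
    """Frame -> Hamming window -> power spectrum -> mel filterbank -> log -> DCT-II."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    log_mel = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_mfcc), (2 * n + 1) / (2.0 * n_filters)))
    return log_mel @ dct.T  # shape: (n_frames, n_mfcc)

# Hypothetical closed-set trial: two formant-like sinusoids per "speaker";
# real use would take the PRAAT-truncated vowel segments instead.
rng = np.random.default_rng(0)
SR = 16000

def synth_vowel(f1, f2, dur=0.2):
    t = np.arange(int(SR * dur)) / SR
    return (np.sin(2 * np.pi * f1 * t) + 0.5 * np.sin(2 * np.pi * f2 * t)
            + 0.01 * rng.standard_normal(t.size))

FORMANTS = {"speaker_1": (300, 2300), "speaker_2": (700, 1100), "speaker_3": (450, 1700)}
templates = {s: mfcc(synth_vowel(*ff), SR).mean(axis=0) for s, ff in FORMANTS.items()}

def identify(segment):
    """Closed-set decision: nearest enrolled template by Euclidean distance in MFCC space."""
    feat = mfcc(segment, SR).mean(axis=0)
    return min(templates, key=lambda s: np.linalg.norm(feat - templates[s]))

print(identify(synth_vowel(700, 1100)))  # matches speaker_2's formant pattern
```

Averaging MFCC frames into one template per speaker and taking the nearest template is the simplest closed-set decision rule; cross-channel comparisons (live vs. mobile network) degrade such matching because the channel alters the spectral envelope the MFCCs encode.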
