Effect of Noise Reduction Technique on Speaker Identification Using Mel-Frequency Cepstral Co-Efficients of Long Vowels | Journal of All India Institute of Speech and Hearing

Vol 35 No 1 (2016)
Speech

Effect of Noise Reduction Technique on Speaker Identification Using Mel-Frequency Cepstral Co-Efficients of Long Vowels

How to Cite
KR, P., & N, H. (2016). Effect of Noise Reduction Technique on Speaker Identification Using Mel-Frequency Cepstral Co-Efficients of Long Vowels. Journal of All India Institute of Speech and Hearing, 35(1), 19-28. Retrieved from http://203.129.241.91/jaiish/index.php/aiish/article/view/920

Abstract

Speech is always accompanied by noise when the speaker is talking in a natural environment. To improve the intelligibility of the speech signal, noise should be reduced using noise reduction software. The aim of the present study was to examine the effect of an existing noise reduction technique on speaker identification using Mel Frequency Cepstral Coefficients (MFCC) of the long vowels in the Kannada language. Ten Kannada-speaking neuro-typical adults in the age range of 20-35 years (5 males and 5 females) participated in the study. Commonly occurring meaningful Kannada sentences with the long vowels /a:/, /i:/, and /u:/ were used for a reading task. The sentences were recorded in two different conditions: Lab condition and Traffic condition. These samples were analyzed in two phases, Before noise reduction (BNR) and After noise reduction (ANR), using Sound Cleaner software. The vowels were truncated using PRAAT software, and Speech Science Lab Workbench software was used to extract their MFCC. In all five comparisons (Lab condition, Traffic condition (BNR), Traffic condition (ANR), Lab versus Traffic condition (BNR), and Lab versus Traffic condition (ANR)), the vowel /i:/ yielded the highest average percentage of correct speaker identification, followed by /a:/ and /u:/. Overall, the results revealed that the vowel /i:/ is better for speaker identification. Hence, the Sound Cleaner software has a significant effect on the percentage of correct speaker identification, reducing the influence of noise without majorly affecting the acoustic parameters of certain vowels considered in the present study.
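The percentage of correct speaker identification reported above can be illustrated with a minimal sketch. The abstract does not state how speakers were matched, so the nearest-neighbour rule with Euclidean distance used here is an assumption, as are the function names and the made-up three-coefficient vectors; none of this is data or code from the study.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def percent_correct_identification(reference, test):
    """Percentage of test samples whose nearest reference vector
    belongs to the same speaker.

    reference, test: dicts mapping a speaker label to a feature vector
    (for example, the averaged MFCCs of one vowel for that speaker).
    The matching rule is an assumption, not the study's stated method.
    """
    correct = 0
    for speaker, vector in test.items():
        # Pick the reference speaker whose vector is closest to this sample.
        nearest = min(reference, key=lambda ref: euclidean(reference[ref], vector))
        if nearest == speaker:
            correct += 1
    return 100.0 * correct / len(test)

# Hypothetical vectors for three speakers (illustration only).
lab = {"S1": [1.0, 2.0, 3.0], "S2": [4.0, 0.0, 1.0], "S3": [0.0, 5.0, 2.0]}
traffic = {"S1": [1.2, 2.1, 2.8], "S2": [3.7, 0.4, 1.1], "S3": [1.1, 2.4, 3.0]}

score = percent_correct_identification(lab, traffic)
print(f"Lab versus Traffic: {score:.1f}% correct")
```

In this toy data, the third speaker's noisy sample drifts closer to the first speaker's reference, which is the kind of error a noise reduction step is meant to limit.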
