Characterization of speaker recognition in noisy channels

Title: Characterization of speaker recognition in noisy channels
Author: Ghilduta, Robert
Abstract: Speaker recognition is a frequently overlooked form of biometric security. Text-independent speaker identification is used by financial services, forensic experts, and human-computer interaction developers to extract information that is transmitted along with a spoken message, such as the identity, gender, age, and emotional state of a speaker. Speech features are classified as either low-level or high-level characteristics. High-level speech features are associated with syntax, dialect, and the overall meaning of a spoken message. In contrast, low-level features such as pitch and phonemic spectra are associated much more with the physiology of the human vocal tract. These low-level features are also the easiest and least computationally intensive characteristics of speech to extract. Once the features are extracted, modern speaker recognition systems attempt to fit them to statistical classification models. One such widely used model is the Gaussian Mixture Model (GMM). Testing of speaker recognition systems is currently standardized by NIST in the frequently updated NIST Speaker Recognition Evaluation (NIST-SRE). The results measured by the tests outlined in the standard are ultimately presented as Detection Error Tradeoff (DET) curves and detection cost function scores. This thesis presents a new method of measuring the effects of channel impediments on the quality of identifications made by GMM-based speaker recognition systems. With the exception of the NIST-SRE, no standardized or extensive testing of speaker recognition systems in noisy channels has been conducted. Thorough testing of speaker recognition systems is conducted in channel model simulators. Additionally, the NIST-SRE error metric is evaluated against a newly proposed metric for gauging the performance and improvements of speaker recognition systems.
Record URI:
Date: 2012-03

Files in this item

File: RGhildutaThesis3-2012.pdf
Size: 1.035Mb
Format: PDF
