A Very low bit-rate speech recognition system

Title: A Very low bit-rate speech recognition system
Author: Baecher, Matthew
Abstract: When using extracted speech feature coefficients for speech synthesis, quantization is considered a lossy compression scheme. The data being compressed cannot be recovered or reconstructed exactly. However, in a speech recognition system for command and control purposes, a certain amount of quantization can be allowed, with comparable results. In some cases, quantization even serves to "close the gaps" between the coefficients of the incoming speech signal and those of the templates. Since the coefficients are not being used to reconstruct the signal, a very coarse quantization can be used, enabling a very low bit-rate transmission with very good recognition results. To reduce the bandwidth further, a binary coding procedure, such as Huffman or Arithmetic Coding, can be applied to the quantized coefficients. Upon receipt of the transmission, the quantized coefficients are decoded and used to perform speech recognition. The sets of coefficients are compared to the templates for each of the commands in the vocabulary. Speech, however, is dynamic in nature and a dynamic recognition procedure is needed to allow for different vocal inflections and durations. A procedure called Dynamic Time Warping is used to "warp" the time axis of the templates to more closely fit the information coming in. By combining all these techniques, a very accurate, very low bit-rate recognizer has been developed and is discussed in this paper.
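Illustration: the abstract describes a receiver that takes coarsely quantized feature coefficients and matches them against per-command templates with Dynamic Time Warping. The sketch below shows one way that pipeline could look. It is not taken from the thesis: the uniform quantizer, the step size of 0.5, the Euclidean frame distance, the unconstrained warping path, and every function and variable name are assumptions made for illustration. The entropy coding stage (Huffman or arithmetic coding) is omitted.

```python
import math


def quantize(frames, step=0.5):
    """Coarsely quantize each coefficient to the nearest multiple of `step`.

    Assumption: a uniform scalar quantizer; the thesis may use another scheme.
    """
    return [[round(c / step) * step for c in frame] for frame in frames]


def frame_distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq, template):
    """Accumulated cost of the best time alignment between two sequences.

    Classic dynamic-programming DTW with steps (i-1, j), (i, j-1), (i-1, j-1)
    and no slope or window constraints.
    """
    n, m = len(seq), len(template)
    inf = float("inf")
    # cost[i][j] = best cost of aligning seq[:i] with template[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq[i - 1], template[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(seq, templates):
    """Return the name of the template with the smallest DTW distance."""
    return min(templates, key=lambda name: dtw_distance(seq, templates[name]))


# Hypothetical one-dimensional "features" for a two-command vocabulary.
templates = {
    "go": [[0.0], [1.0], [2.0]],
    "stop": [[2.0], [1.0], [0.0]],
}
# A slower, slightly noisy utterance of "go".
utterance = [[0.1], [0.1], [0.9], [2.2], [2.1]]
print(recognize(quantize(utterance), templates))
```

In this toy case the quantizer snaps the noisy coefficients onto the template values, which is the "close the gaps" effect the abstract mentions, and the warping absorbs the difference in duration between the utterance and the template.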
Record URI: http://hdl.handle.net/1850/14360
Date: 2001-10

Files in this item

File: MBaecherThesis10-2001.pdf
Size: 6.551Mb
Format: PDF
