Please use this identifier to cite or link to this item: http://hdl.handle.net/2005/1094

Title: Speech Signal Classification Using Support Vector Machines
Authors: Sood, Gaurav
Advisors: Balakrishnan, N
Keywords: Speech Recognition; Speech Signal Processing; Automatic Speech Recognition; Artificial Neural Networks; Support Vector Machine; Time Normalization; Hidden Markov Models (HMMs)
Submitted Date: Jul-2009
Series/Report no.: G23702
Abstract: Hidden Markov Models (HMMs) are undoubtedly the most widely used core technique for Automatic Speech Recognition (ASR). Nevertheless, we are still far from achieving high-performance ASR systems. Some alternative approaches, most of them based on Artificial Neural Networks (ANNs), were proposed during the late eighties and early nineties. Some of them tackled the ASR problem using predictive ANNs, while others proposed hybrid HMM/ANN systems. Despite some achievements, however, the field still depends on Hidden Markov Models. During the last decade, a new tool has appeared in the field of machine learning that has proved able to cope with hard classification problems in several fields of application: the Support Vector Machine (SVM). SVMs are effective discriminative classifiers with several outstanding characteristics: their solution is the one with maximum margin; they are capable of dealing with samples of very high dimensionality; and their convergence to the minimum of the associated cost function is guaranteed. In this work, a novel approach based on probabilistic kernels in support vector machines has been attempted for speech data classification. The accuracy of support vector classification depends on the kernel function used, and the best kernel in turn depends on the data set at hand. As yet, there is no way to know a priori which kernel will give the best results. Because support vector machines act on static data only, the kernel used in this work normalizes the time dimension inherent to speech signals by fitting a probability distribution over the individual data points of each utterance. The divergence between the probability distributions fitted over individual speech utterances is then used to form the kernel matrix.
Vowel classification and isolated word recognition (digit recognition) have been attempted, and the results are compared with state-of-the-art systems.
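
The kernel construction described in the abstract can be illustrated with a minimal sketch. The abstract does not say which probability distribution or which divergence the thesis uses, so everything specific here is an assumption made for illustration: each utterance is modelled by a diagonal-covariance Gaussian, the divergence is the symmetric Kullback-Leibler divergence, and it is turned into a similarity with exp(-scale * divergence). The function names and the `scale` and `var_floor` parameters are hypothetical and are not taken from the thesis.

```python
import numpy as np

def fit_diag_gaussian(frames, var_floor=1e-6):
    """Fit a diagonal Gaussian to one utterance of shape (n_frames, n_dims).

    The frame count (time dimension) disappears here: every utterance is
    summarised by a mean vector and a variance vector of fixed size.
    """
    frames = np.asarray(frames, dtype=float)
    mean = frames.mean(axis=0)
    var = frames.var(axis=0) + var_floor  # floor avoids division by zero
    return mean, var

def symmetric_kl(p, q):
    """Symmetric KL divergence KL(p||q) + KL(q||p) for diagonal Gaussians."""
    (mu_p, var_p), (mu_q, var_q) = p, q
    diff_sq = (mu_p - mu_q) ** 2
    kl_pq = 0.5 * np.sum(np.log(var_q / var_p) + (var_p + diff_sq) / var_q - 1.0)
    kl_qp = 0.5 * np.sum(np.log(var_p / var_q) + (var_q + diff_sq) / var_p - 1.0)
    return kl_pq + kl_qp

def divergence_kernel_matrix(utterances, scale=0.1):
    """Build K[i, j] = exp(-scale * D(p_i, p_j)) over all utterance pairs."""
    models = [fit_diag_gaussian(u) for u in utterances]
    n = len(models)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(i, n):
            K[i, j] = K[j, i] = np.exp(-scale * symmetric_kl(models[i], models[j]))
    return K

# Usage: three synthetic "utterances" of different lengths, 3 features each.
rng = np.random.default_rng(0)
utterances = [rng.normal(size=(n_frames, 3)) for n_frames in (20, 35, 50)]
K = divergence_kernel_matrix(utterances)
print(K.shape)  # one row and one column per utterance
```

A matrix built this way could be passed to an SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`. Note that an exponentiated divergence of this kind is not guaranteed to be positive semidefinite, so in practice it may need to be checked or regularised.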
URI: http://etd.iisc.ernet.in/handle/2005/1094
Appears in Collections: Aerospace Engineering (aero)

Files in This Item:

File          Size       Format
G23702.pdf    1.27 MB    Adobe PDF

Items in etd@IISc are protected by copyright, with all rights reserved, unless otherwise indicated.

