etd@IISc Collection:
http://etd.iisc.ernet.in/2005/20
2018-12-06T20:38:54Z
Sparsity Motivated Auditory Wavelet Representation and Blind Deconvolution
http://etd.iisc.ernet.in/2005/3987
Title: Sparsity Motivated Auditory Wavelet Representation and Blind Deconvolution
Authors: Adiga, Aniruddha
Abstract: In many scenarios, events such as singularities and transients that carry important information about a signal undergo spreading during acquisition or transmission, and it is important to localize these events. For example, edges in an image and point sources in a microscopy or astronomical image are blurred by the point-spread function (PSF) of the acquisition system, while in a speech signal, the epochs corresponding to glottal closure instants are shaped by the vocal tract response. Such events can be extracted with the help of techniques that promote sparsity, which enables separation of the smooth components from the transient ones. In this thesis, we consider the development of such sparsity-promoting techniques. The contributions of the thesis are three-fold: (i) an auditory-motivated continuous wavelet design and representation, which helps identify singularities; (ii) a sparsity-driven deconvolution technique; and (iii) a sparsity-driven deconvolution technique for the reconstruction of finite-rate-of-innovation (FRI) signals. We use the speech signal for illustrating the performance of the techniques in the first two parts and super-resolution microscopy (2-D) for the third part.
In the first part, we develop a continuous wavelet transform (CWT) starting from an auditory motivation. Wavelet analysis provides good time and frequency localization, which has made it a popular tool for time-frequency analysis of signals. The CWT is a multiresolution analysis tool that involves decomposition of a signal using a constant-Q wavelet filterbank, akin to the time-frequency analysis performed
by the basilar membrane in the peripheral human auditory system. This connection motivated us to develop wavelets that possess auditory localization capabilities. Gammatone functions are extensively used in the modeling of the basilar membrane, but the non-zero average of these functions poses a hurdle. We construct bona fide wavelets from the Gammatone function, called Gammatone wavelets, and analyze their properties such as admissibility, time-bandwidth product, vanishing moments, etc.
Of particular interest is the vanishing-moments property, which enables the wavelet to suppress smooth regions in a signal, leading to sparsification. We show how this property of the Gammatone wavelets, coupled with multiresolution analysis, could be employed for singularity and transient detection. Using these wavelets, we also construct equivalent filterbank models and obtain cepstral feature vectors from such a representation. We show that the Gammatone wavelet cepstral coefficients (GWCC) are effective for robust speech recognition compared with mel-frequency cepstral coefficients (MFCC).
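The zero-mean (admissibility) requirement mentioned above can be checked numerically. A minimal sketch of the raw Gammatone function, with illustrative order, bandwidth, and centre frequency (not the thesis's actual wavelet design):

```python
import numpy as np

def gammatone(t, n=4, b=125.0, fc=1000.0):
    """Gammatone function t^(n-1) exp(-2*pi*b*t) cos(2*pi*fc*t) for t >= 0.
    The order n, bandwidth b (Hz) and centre frequency fc (Hz) are illustrative."""
    t = np.asarray(t, dtype=float)
    tp = np.maximum(t, 0.0)
    env = np.where(t >= 0.0, tp ** (n - 1) * np.exp(-2 * np.pi * b * tp), 0.0)
    return env * np.cos(2 * np.pi * fc * t)

# A bona fide wavelet must have zero mean (admissibility); the raw Gammatone
# function does not, which is the hurdle the construction in the thesis removes.
t = np.linspace(0.0, 0.05, 20001)
g = gammatone(t)
dt = t[1] - t[0]
dc = g.sum() * dt              # small but non-zero average of the function
energy = (g ** 2).sum() * dt
print(abs(dc) / np.sqrt(energy))   # relative DC content
```

The printed ratio is tiny but non-zero, which is exactly why a correction is needed before the function qualifies as a wavelet.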
In the second part, we consider the problem of sparse blind deconvolution (SBD) starting from a signal obtained as the convolution of an unknown PSF and a sparse excitation. The BD problem is ill-posed, and the goal is to employ sparsity to come up with an accurate solution. We formulate the SBD problem within a Bayesian framework. The estimation of the filter and excitation involves optimization of a cost function that consists of an ℓ2 data-fidelity term and an ℓp-norm (p ∈ [0, 1]) regularizer as the sparsity-promoting prior. Since the ℓp-norm is not differentiable at the origin, we consider a smoothed version of the ℓp-norm as a proxy in the optimization. Apart from the regularizer being non-convex, the data term is also non-convex in the filter and excitation as they are both unknown. We optimize the non-convex cost using an alternating minimization strategy, and develop an alternating ℓp-ℓ2 projections algorithm (ALPA). We demonstrate convergence of the iterative algorithm, analyze in detail the role of the pseudo-inverse solution as an initialization for ALPA, and provide probabilistic bounds on its accuracy considering the presence of noise and the condition number of the linear system of equations. We also consider the case of bounded noise and derive tight tail bounds using the Hoeffding inequality.
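A toy numerical sketch of the alternating ℓp-ℓ2 idea: the smoothed ℓp regularizer is handled by iteratively reweighted least squares, the filter by plain least squares. It uses a random initialization rather than the pseudo-inverse analyzed in the thesis, and all parameter values are illustrative:

```python
import numpy as np

def conv_matrix(v, n):
    """Full-convolution matrix C with C @ u == np.convolve(v, u) for len(u) == n."""
    C = np.zeros((len(v) + n - 1, n))
    for j in range(n):
        C[j:j + len(v), j] = v
    return C

def alpa_sketch(y, n_x, n_h, p=0.5, lam=0.01, eps=1e-3, outer=30, inner=5, seed=0):
    """Toy alternating minimization of ||y - h*x||^2 + lam * sum((x^2+eps)^(p/2))."""
    rng = np.random.default_rng(seed)
    h = rng.standard_normal(n_h)
    x = np.zeros(n_x)
    for _ in range(outer):
        # x-step: IRLS on the smoothed l_p regularizer
        H = conv_matrix(h, n_x)
        HtH, Hty = H.T @ H, H.T @ y
        for _ in range(inner):
            w = p * (x ** 2 + eps) ** (p / 2 - 1)
            x = np.linalg.solve(HtH + lam * np.diag(w), Hty)
        # h-step: least squares for the filter given the excitation
        X = conv_matrix(x, n_h)
        h, *_ = np.linalg.lstsq(X, y, rcond=None)
        # fix the inherent scale ambiguity of blind deconvolution
        s = np.linalg.norm(h)
        if s > 1e-12:
            h, x = h / s, x * s
    return h, x

# toy data: sparse excitation convolved with a smooth decaying filter
x_true = np.zeros(40); x_true[[5, 15, 28]] = [1.0, -0.8, 0.6]
h_true = np.exp(-np.arange(8) / 2.0)
y = np.convolve(h_true, x_true)
h_est, x_est = alpa_sketch(y, 40, 8)
residual = np.linalg.norm(np.convolve(h_est, x_est) - y) / np.linalg.norm(y)
```

The per-iteration normalization of the filter is one common way to pin down the scale ambiguity inherent in blind deconvolution; the thesis's convergence analysis and initialization strategy are not reproduced here.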
As an application, we consider the problem of blind deconvolution of speech signals. In the linear model for speech production, voiced speech is assumed to be the result of a quasi-periodic impulse train exciting a vocal-tract filter. The locations of the impulses or epochs indicate the glottal closure instants, and the spacing between them indicates the pitch. Hence, the excitation in the case of voiced speech is sparse, and its deconvolution from the vocal-tract filter is posed as an SBD problem. We employ ALPA for SBD and show that the excitation obtained is sparser than the excitations obtained using sparse linear prediction, the smoothed ℓ1/ℓ2 sparse blind deconvolution algorithm, and majorization-minimization-based sparse deconvolution techniques. We also consider the problem of epoch estimation and show that the epochs estimated by ALPA in both clean and noisy conditions are closer to the instants indicated by the electroglottograph, when compared with the estimates provided by the zero-frequency filtering technique, which is the state-of-the-art epoch estimation technique.
In the third part, we consider the problem of deconvolution of a specific class of continuous-time signals called finite-rate-of-innovation (FRI) signals, which are not bandlimited, but are specified by a finite number of parameters over an observation interval. The signal is assumed to be a linear combination of delayed versions of a prototypical pulse. The reconstruction problem is posed as a 2-D SBD problem. The kernel is assumed to have a known form but with unknown parameters. Given the sampled version of the FRI signal, the delays quantized to the nearest point on the sampling grid are first estimated using a proximal-operator-based alternating ℓp-ℓ2 algorithm (ALPAprox), and then super-resolved to obtain off-grid (OG) estimates using gradient-descent optimization. The overall technique is termed OG-ALPAprox.
We show application of OG-ALPAprox to a particular modality of super-resolution microscopy (SRM), called stochastic optical reconstruction microscopy (STORM).
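As context for the localization step in STORM: each diffraction-limited spot coming from an isolated fluorophore can be localized with sub-pixel precision. A numpy-only toy, using an intensity centroid as a crude stand-in for full Gaussian peak-fitting (which would also estimate amplitude, width, and background); grid size and spot width are illustrative:

```python
import numpy as np

def psf_image(x0, y0, sigma, size):
    """Diffraction-limited spot modeled as an isotropic 2-D Gaussian on a pixel grid."""
    yy, xx = np.mgrid[0:size, 0:size]
    return np.exp(-((xx - x0) ** 2 + (yy - y0) ** 2) / (2.0 * sigma ** 2))

def localize(img):
    """Sub-pixel localization by intensity centroid."""
    yy, xx = np.mgrid[0:img.shape[0], 0:img.shape[1]]
    s = img.sum()
    return (xx * img).sum() / s, (yy * img).sum() / s

img = psf_image(7.3, 6.8, 1.5, 15)   # true centre at (7.3, 6.8), off the pixel grid
x_hat, y_hat = localize(img)         # recovers the centre to well below a pixel
```

The point is that a point source can be located far more precisely than the pixel pitch when spots are isolated, which is what STORM's sequential sparse activation guarantees.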
The resolution of the traditional optical microscope is limited by diffraction; this limit is termed Abbe's limit. The goal of SRM is to engineer the optical imaging system to resolve structures in specimens, such as proteins, whose dimensions are smaller than the diffraction limit. The specimen to be imaged is tagged or labeled with light-emitting or fluorescent chemical compounds called fluorophores. These compounds specifically bind to proteins and exhibit fluorescence upon excitation. The fluorophores are assumed to be point sources, and the light emitted by them undergoes spreading due to diffraction. STORM employs a sequential approach, wherein, in each step, only a few fluorophores are randomly excited and the image is captured by a sensor array. The obtained image is diffraction-limited; however, the separation between the fluorophores allows for localizing the point sources with high precision. The localization is performed using Gaussian peak-fitting. This process of random excitation coupled with localization is performed sequentially, and the results are subsequently consolidated to obtain a high-resolution image. We pose the localization as an SBD problem and employ OG-ALPAprox to estimate the locations. We also report comparisons with the de facto standard Gaussian peak-fitting algorithm and show that the statistical performance of OG-ALPAprox is superior. Experimental results on real data show that the reconstruction quality is on par with Gaussian peak-fitting.
2018-08-28T18:30:00Z
Studies on Silicone Rubber Insulators used for High Voltage Transmission
http://etd.iisc.ernet.in/2005/3981
Title: Studies on Silicone Rubber Insulators used for High Voltage Transmission
Authors: Chakraborty, Rahul
Abstract: Recently, high-temperature vulcanized (HTV) silicone rubber (SIR)/polymeric/composite insulators have been gaining wider acceptance as overhead transmission line insulators for extra-high-voltage (EHV) and ultra-high-voltage (UHV) systems due to promising features such as hydrophobicity recovery, light weight, ease of handling and installation, better pollution flashover performance, and admirable resistance against vandalism. Since polymeric insulators are of recent origin, their long-term field performance is yet to be understood. Owing to their organic nature and exposure to environmental stresses like pollution, temperature, UV radiation, humidity, fog and rain, the insulator performance degrades over a period of time. The sheds/petticoats of the insulators become wettable, leading to frequent flashover in humid and contaminated environments. Hence, the long-term reliability of composite insulators is of foremost concern to power utilities. The available literature on the long-term field performance of these insulators under different climatic conditions and multiple environmental stresses, for both HTV SIR and liquid silicone rubber (LSR), is scant. Also, there is no reference standard for laboratory pollution/contamination test methods for evaluating these insulators; currently, however, a CIGRE Working Group is working towards the standardization of artificial pollution test methods for polymeric insulators. The thesis addresses some of these issues in detail.
In the first part of the thesis, a new and simple pre-treatment methodology to achieve a uniform contamination layer on inherently hydrophobic HTV SIR insulator samples is presented for laboratory pollution performance evaluation. Diffusion of surface water during the dipping period is found to make the insulator surface temporarily hydrophilic. The uniform contamination layer is then applied by dipping the sample immediately in the pollution slurry. Exhaustive experiments were conducted on full-scale SIR insulators as well as SIR slabs to investigate the appearance of hydrophilicity on the SIR surface. A specially fabricated arrangement for assessment of Wettability Class (WC) is made as per IEC standards. The results of WC measurement and wet flashover studies support the temporary reduction in hydrophobicity of SIR due to the dipping phase in the proposed pre-treatment methodology.
The next part of the thesis presents the results of long-term thermal aging experiments conducted on HTV SIR with different degrees of pollution (medium, heavy); in this way, the effect of an arid desert climate on polymeric insulators is studied. The experimental set-up consists of a controlled HVAC source, a temperature-controlled furnace with provision for high-voltage (HV) application and leakage current (LC) monitoring, a digital storage oscilloscope (DSO), a National Instruments compact DAQ-9201 operated on the LabVIEW platform, etc. Two types of HTV SIR insulators are considered for the study. Flat slabs as well as full-scale insulator samples of creepage length 725 mm are stressed simultaneously to simulate the in-service condition. The experimentation is conducted for about 575 hours with application of 21.0 kVrms at 60 °C. The results of hydrophobicity recovery for thermally aged contaminated polymeric insulators are reported. Besides monitoring electrical and mechanical properties, changes in the material properties of SIR are also analyzed using physicochemical analysis techniques such as Fourier transform infrared (FTIR) spectroscopy, X-ray photoelectron spectroscopy (XPS), scanning electron microscopy (SEM), thermo-gravimetric analysis (TGA) and differential scanning calorimetry (DSC). Some of the key findings of the study are increased surface oxidation, surface roughness and mechanical stress due to thermal aging of polymeric insulators. Experimental investigations show that the characteristics of the power frequency component of the leakage current can be linked with thermal aging of SIR.
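The power-frequency component of the leakage current referred to above can be extracted with a simple DFT. A synthetic sketch, in which the sampling rate, amplitudes, and harmonic content are made up purely for illustration:

```python
import numpy as np

fs, f0 = 5000.0, 50.0                  # sampling rate and power frequency (Hz)
t = np.arange(int(fs)) / fs            # 1 s record -> 1 Hz bin spacing
# synthetic leakage current: 2 mA fundamental plus a 0.4 mA third harmonic
i_leak = 2.0 * np.sin(2 * np.pi * f0 * t) + 0.4 * np.sin(2 * np.pi * 3 * f0 * t)

spec = np.abs(np.fft.rfft(i_leak)) * 2.0 / len(i_leak)   # single-sided amplitudes
freqs = np.fft.rfftfreq(len(i_leak), 1.0 / fs)
amp_f0 = spec[np.argmin(np.abs(freqs - f0))]             # 50 Hz component
print(round(amp_f0, 3))    # -> 2.0
```

Tracking such fundamental (and harmonic) amplitudes over the aging period is one plausible way to quantify the LC characteristics; the thesis's actual feature definitions and thresholds are not reproduced here.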
Further, a unique climatic aging experimental facility is established to evaluate the long-term reliability of SIR under environmental stresses such as UV radiation, humidity, temperature and applied electric stress. The investigations are conducted on two different types of HTV SIR and on LSR, using flat samples as well as full-scale insulator samples. The experimentation is conducted for 500 hours with 10.0 kVrms at 50 °C, with 85% humidity and 1 W/m2 UV irradiation, in accordance with the aging cycle specified in the IEC standard. The results of the comparative studies conducted on the electrical, mechanical and material properties indicate leakage current pulses, brittleness and salt deposition for multi-stress-aged samples.
In summary, an attempt has been made to contribute a pollution test methodology with a simple pre-treatment technique for the inherently hydrophobic HTV SIR surface to achieve better uniformity of the contamination layer. Also, electro-thermal and multiple-stress investigations were conducted to assess the long-term performance of polymeric insulators.
2018-08-19T18:30:00Z
On a Divide-and-Conquer Approach for Sensor Network Localization
http://etd.iisc.ernet.in/2005/3982
Title: On a Divide-and-Conquer Approach for Sensor Network Localization
Authors: Sanyal, Rajat
Abstract: Advancements in micro-electro-mechanical systems and wireless communication have proliferated the deployment of large-scale wireless sensor networks. Due to cost, size and power constraints, at most a few sensor nodes can be equipped with a global positioning system; such nodes (whose positions can be accurately determined) are referred to as anchors. However, one can determine the distance between two nearby sensors using some form of local communication. The problem of computing the positions of the non-anchor nodes from the inter-sensor distances and anchor positions is referred to as sensor network localization (SNL).
In this dissertation, our aim is to develop an accurate, efficient, and scalable localization algorithm, which can operate both in the presence and absence of anchors. It has been demonstrated in the literature that divide-and-conquer approaches can be used to localize large networks without compromising the localization accuracy. The core idea behind such approaches is to partition the network into overlapping subnetworks, localize each subnetwork using the available distances (and anchor positions), and finally register the subnetworks in a single coordinate system. In this regard, the contributions of this dissertation are as follows:
We study the global registration problem and formulate a necessary “rigidity” condition for uniquely recovering the global sensor locations. In particular, we present a method for efficiently testing rigidity, and a heuristic for augmenting the partitioned network to enforce rigidity.
We present a mechanism for partitioning the network into smaller subnetworks using cliques. Each clique is efficiently localized using multidimensional scaling.
Finally, we use a recently proposed semidefinite program (SDP) to register the localized subnetworks. We develop a scalable ADMM solver for the SDP in question.
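For the subnetwork-localization step, classical multidimensional scaling recovers coordinates from a noiseless Euclidean distance matrix up to a rigid transform. A minimal sketch (the registration of the resulting subnetworks into a common frame would then need the Procrustes/SDP step described above):

```python
import numpy as np

def classical_mds(D, dim=2):
    """Classical MDS: recover point coordinates (up to rotation, translation,
    reflection) from a Euclidean distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centred Gram matrix
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:dim]          # top-`dim` eigenpairs
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

# toy subnetwork: 6 sensors in the plane, localized from pairwise distances
rng = np.random.default_rng(0)
P = rng.random((6, 2))
D = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)
X = classical_mds(D)
D_hat = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)   # matches D
```

With noiseless distances the recovered configuration reproduces the distance matrix exactly, which is why MDS is a natural per-clique localizer in the divide-and-conquer pipeline.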
We present simulation results on random and structured networks to demonstrate that the proposed methods perform better than state-of-the-art methods in terms of run-time, accuracy, and scalability.
2018-08-19T18:30:00Z
Effective Characterization of Sequence Data through Frequent Episodes
http://etd.iisc.ernet.in/2005/3969
Title: Effective Characterization of Sequence Data through Frequent Episodes
Authors: Ibrahim, A
Abstract: Pattern discovery is an important area of data mining referring to a class of techniques designed for the extraction of interesting patterns from data. A pattern is some kind of local structure that captures correlations and dependencies present in the elements of the data. In general, pattern discovery is about finding all patterns of `interest' in the data, and a popular measure of interestingness for a pattern is its frequency of occurrence in the data. Thus, the problem of frequent pattern discovery is to find all patterns in the data whose frequency of occurrence exceeds some user-defined threshold. However, frequency is not the only measure of interestingness, and other measures and techniques also exist for finding interesting patterns.
This thesis is concerned with efficient discovery of inherent patterns from long
sequence (or temporally ordered) data. Mining of such sequentially ordered data is
called temporal data mining and the temporal patterns that are discovered from large
sequential data are called episodes. More specifically, this thesis explores efficient
methods for finding small and relevant subsets of episodes from sequence data that
best characterize the data. The thesis also discusses methods for comparing datasets,
based on comparing the sets of patterns representing the datasets.
The data in a frequent episode discovery framework is abstractly viewed as a single
long sequence of events. Here, an event is a tuple (E_i, t_i), where E_i is referred to as an event-type (taking values from a finite alphabet) and t_i is the time of occurrence.
The events are ordered in the non-decreasing order of the time of occurrence. The
pattern of interest in such a sequence is called an episode, which is a collection of
event-types with a partial order defined over it. In this thesis, the focus is on a special
type of episode called serial episode, where there is a total order defined among the
collection of event-types representing the episode. The occurrence of an episode is
essentially a subset of events from the data whose event-types match the set of event-types associated with the episode and whose order of occurrence conforms to the underlying partial order of the episode. The frequency of an episode is some measure of how often it occurs in the event stream. Many different notions of frequency have been defined in the literature. Given a frequency definition, the goal of frequent episode discovery is to unearth all episodes which have a frequency greater than a user-defined threshold. The size of an episode is the number of event-types in the episode. An episode β is called a subepisode of another episode α if the collection of event-types of β is a subset of the corresponding collection of α and the event-types of β satisfy the same partial order relationships present among the corresponding event-types of α.
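For serial episodes, where the partial order is total, the subepisode relation reduces to a subsequence test over the event-types. A one-function sketch:

```python
def is_serial_subepisode(beta, alpha):
    """For serial episodes (totally ordered event-types), beta is a subepisode
    of alpha exactly when beta's event-types form a subsequence of alpha's."""
    it = iter(alpha)
    # membership tests on the iterator consume it, enforcing left-to-right order
    return all(e in it for e in beta)

print(is_serial_subepisode("AC", "ABC"))   # -> True
print(is_serial_subepisode("CA", "ABC"))   # -> False: order is violated
```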
The set of all episodes can be arranged in a partial order lattice, where each level
i contains episodes of size i and the partial order is the subepisode relationship. In
general, there are two approaches for mining frequent episodes, based on the way one
traverses this lattice. The first approach is to traverse this lattice in a breadth-first
manner, and is called the Apriori approach. The other approach is the Pattern growth
approach, where the lattice is traversed in a depth-first manner. There exist different frequency notions for episodes, and many Apriori-based algorithms have been proposed for mining frequent episodes under the different frequencies. However, Pattern-growth-based methods do not exist for many of the frequency notions.
The first part of the thesis proposes new Pattern-growth methods for discovering
frequent serial episodes under two frequency notions called the non-overlapped frequency
and the total frequency. Special cases, where certain additional conditions, called the span and gap constraints, are imposed on the occurrences of the episodes are also considered. The proposed methods, in general, consist of two steps: the candidate
generation step and the counting step. The candidate generation step involves finding potential frequent episodes. This is done by following the general Pattern growth
approach for finding the candidates, which is the depth-first traversal of the lattice of all episodes. The second step, which is the counting step, involves counting the frequencies of the episodes. The thesis presents efficient methods for counting
the occurrences of serial episodes using occurrence windows of subepisodes for both
the non-overlapped and total frequency. The relative advantages of Pattern-growth
approaches over Apriori approaches are also discussed. Through detailed simulation
results, the effectiveness of this approach on a host of synthetic and real data sets
is shown. It is shown that the proposed methods are highly scalable and efficient in
runtime as compared to the existing Apriori approaches.
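For intuition on one of the frequency notions used above: a single greedy automaton pass over the event sequence yields the non-overlapped occurrence count of a serial episode. This sketch omits the span/gap constraints and the subepisode occurrence windows used by the thesis's counters:

```python
def count_nonoverlapped(episode, events):
    """Greedy count of non-overlapped occurrences of a serial episode in a
    time-ordered event sequence given as [(event_type, time), ...]."""
    count, pos = 0, 0
    for etype, _t in events:
        if etype == episode[pos]:
            pos += 1                      # advance the episode automaton
            if pos == len(episode):
                count, pos = count + 1, 0  # occurrence completed; restart
    return count

seq = [('A', 1), ('A', 2), ('B', 3), ('C', 4), ('A', 5), ('B', 6), ('C', 7)]
print(count_nonoverlapped(('A', 'B', 'C'), seq))   # -> 2
```

Greedy left-to-right tracking is the standard way to obtain the maximum number of non-overlapped occurrences for a serial episode.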
One of the main issues in frequent pattern mining is the huge number of frequent
patterns, returned by the discovery methods, irrespective of the approach taken to solve the problems. The second part of this thesis, addresses this issue and discusses methods of selecting a small subset of relevant episodes from event sequences. There have been a few approaches, discussed in the literature, for finding a small subset of patterns. One set of methods are information theory based methods, where patterns that provide maximum information are searched for. Another approach is the Minimum Description Length (MDL) principle based summarization schemes. Here the data is encoded using a subset of patterns (which forms the model for the data) and its occurrences. The subset of patterns that has the maximum efficiency in encoding
the data is the best representative model for the data. The MDL principle takes into
account both the encoding efficiency of the model as well as model complexity. A
method, called Constrained Serial episode Coding (CSC), is proposed based on the
MDL principle, which returns a highly relevant, non-redundant and small subset of
serial episodes. This also includes an encoding scheme, where the model representation and the encoding of the data are efficient. An interesting feature of this algorithm for isolating a small set of relevant episodes is that it does not need a user-specified threshold on frequency. The effectiveness of this method is shown on two types of data. The first is data obtained from a detailed simulator for a reconfigurable coupled conveyor system. The conveyor system consists of different intersecting paths and packages flow through such a network. Mining of such data can allow one to unearth the main paths of package flows, which can be useful in remote monitoring
and visualization of the system. On this data, it is shown that the proposed method
is able to return highly consistent sub paths, in the form of serial episodes, with
great encoding efficiency as compared to other known related sequence summarization
schemes, like SQS and GoKrimp. The second type of data consists of a collection
of multi-class sequence datasets. It is shown that the selected episodes from the proposed
method form good features in classification. The proposed method is compared
with SQS and GoKrimp, and it is shown that the episodes selected by this method
help in achieving better classification results as compared to other methods.
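The MDL trade-off described above can be made concrete with a toy two-part cost. This sketch only replaces contiguous occurrences and uses uniform codes, both crude simplifications of the CSC encoding (which handles non-contiguous episode occurrences):

```python
import math

def mdl_cost(seq, episode):
    """Toy two-part MDL cost: bits for the model (the episode's event-types)
    plus bits for the data after greedily replacing contiguous occurrences of
    the episode with one new symbol, under a uniform code over alphabet+1
    symbols."""
    alphabet = set(seq)
    out, i = [], 0
    while i < len(seq):
        if episode and seq[i:i + len(episode)] == episode:
            out.append('#')               # one symbol stands for the episode
            i += len(episode)
        else:
            out.append(seq[i])
            i += 1
    bits = math.log2(len(alphabet) + 1)
    return len(episode) * bits + len(out) * bits

seq = "AB" * 20
baseline = len(seq) * math.log2(len(set(seq)))   # encoding with no episodes
print(mdl_cost(seq, "AB") < baseline)            # the episode pays for itself
```

An episode is worth keeping in the model only when the data bits it saves exceed the model bits it costs; that is the criterion a threshold-free method like CSC exploits.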
The third and final part of the thesis discusses methods for comparing sets of patterns representing different datasets. There are many instances when one is interested in comparing datasets. For example, in streaming data, one is interested in knowing whether the characteristics of the data are the same or have changed significantly.
In other cases, one may simply like to compare two datasets and quantify the degree
of similarity between them. Often, data are characterized by a set of patterns as
described above. Comparing sets of patterns representing datasets gives information
about the similarity/dissimilarity between the datasets. However not many measures
exist for comparing sets of patterns. This thesis proposes a similarity measure for
comparing sets of patterns which in turn aids in comparison of different datasets.
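For a rough sense of what comparing pattern sets can look like, here is a hypothetical similarity between serial episodes (normalized longest common subsequence) lifted to sets by averaging. This is an illustration only, not the Pattern Kernel proposed in the thesis, and the averaged LCS similarity is not guaranteed to be a positive semidefinite kernel:

```python
from functools import lru_cache

def episode_similarity(a, b):
    """Toy similarity between two serial episodes: length of the longest
    common subsequence of their event-types, normalized by the longer one."""
    @lru_cache(maxsize=None)
    def lcs(i, j):
        if i == len(a) or j == len(b):
            return 0
        if a[i] == b[j]:
            return 1 + lcs(i + 1, j + 1)
        return max(lcs(i + 1, j), lcs(i, j + 1))
    return lcs(0, 0) / max(len(a), len(b))

def pattern_set_similarity(S, T):
    """Mean pairwise episode similarity between two sets of serial episodes."""
    return sum(episode_similarity(a, b) for a in S for b in T) / (len(S) * len(T))

print(pattern_set_similarity(["AB", "BC"], ["AB", "BC"]))   # high self-similarity
print(pattern_set_similarity(["AB", "BC"], ["DE", "EF"]))   # disjoint alphabets
```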
First, a kernel for comparing two patterns, called the Pattern Kernel, is proposed.
This kernel is proposed for three types of patterns: serial episodes, sequential patterns and itemsets. Using this kernel, a Pattern Set Kernel is proposed for comparing
different sets of patterns. The effectiveness of this kernel is shown in classification and
change detection. The thesis concludes with a summary of the main contributions and some suggestions for extending the work presented here.
2018-08-13T18:30:00Z