<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel rdf:about="http://hdl.handle.net/2005/1">
    <title>etd@IISc Community:</title>
    <link>http://hdl.handle.net/2005/1</link>
    <description />
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://hdl.handle.net/2005/1955" />
        <rdf:li rdf:resource="http://hdl.handle.net/2005/2013" />
        <rdf:li rdf:resource="http://hdl.handle.net/2005/1876" />
        <rdf:li rdf:resource="http://hdl.handle.net/2005/2008" />
      </rdf:Seq>
    </items>
    <dc:date>2013-06-06T13:29:21Z</dc:date>
  </channel>
  <item rdf:about="http://hdl.handle.net/2005/1955">
    <title>Compiler Transformations For Improving The Performance Of Software Transactional Memory</title>
    <link>http://hdl.handle.net/2005/1955</link>
    <description>Title: Compiler Transformations For Improving The Performance Of Software Transactional Memory
Authors: Mannarswamy, Sandya S
Abstract: Expressing synchronization using traditional lock based primitives has been found to be both error-prone and restrictive. Hence there has been considerable research work to develop scalable and programmer-friendly alternatives to lock-based synchronization. Atomic sections have been proposed as a programming idiom for expressing synchronization at a higher-level of abstraction than locks. &#xD;
One way of supporting atomic sections in software is to rely on an underlying Software Transactional Memory (STM) implementation. While STM offers the promise of being a programming paradigm which is less error-prone and more programmer friendly compared to traditional lock-based synchronization, it also needs to be competitive in performance in order for it to be adopted in mainstream software. Unfortunately STMs do not meet the performance goals and are known to incur excessive performance overheads. &#xD;
Prior work by other researchers and our performance analysis of STM applications show that conflicts and the resulting aborts are a major performance bottleneck for STM applications. Second we find that, supporting fine-grained optimistic concurrency can have significant impact on the cache behavior of applications running on STM and hence can adversely affect STM performance. Our systematic quantitative analysis of the cache behavior of STM applications as well as prior work on qualitative analysis of STM overheads show that cache overheads constitute a major performance bottleneck for STM applications. Hence in this thesis, we focus on addressing these two major STM performance bottlenecks. &#xD;
Current STM implementations are typically application unaware in that they do not &#xD;
analyze the application and use that knowledge to improve the application performance on STM. Closer integration of transactions with the programming languages opens up the possibility of using the compiler to analyze STM applications and using that knowledge to perform application code transformations to improve the application performance on STM automatically and in a manner transparent to the programmer. This motivated us to address the two major STM performance bottlenecks namely poor cache performance and performance penalty due to aborts, by compiler transformations. &#xD;
In order to pinpoint the cache bottlenecks of STM, we perform a detailed experimental evaluation of the cache behavior of STM applications and quantify the impact of the different STM factors on the cache misses experienced by the applications. We propose a set of compiler transformations targeted to address the cache performance bottlenecks identified by our analysis. Next we turn our attention to compiler analysis and transformations targeted at reducing the performance overheads due to transactional aborts, effectively utilizing the compiler’s knowledge of the data access patterns of the application. Since not all applications are designed with optimistic concurrency in mind, real world applications typically contain certain atomic sections which are not amenable to STM’s optimistic concurrency control and hence suffer from excessive transactional abort overheads. We propose two compiler techniques for handling such atomic sections. &#xD;
Another major cause of transactional conflicts leading to unnecessary aborts is the uniform granularity access tracking scheme employed by STM implementations. Using a single uniform access tracking granularity leads to poor lock assignment by STM. We propose techniques which use compiler’s knowledge of an application to improve the application unaware lock assignment made by the STM. Last as transactional abort overheads impact STM performance adversely, we propose a compiler-based approach to reduce the transactional abort overheads by reconciling certain kinds of transactions instead of aborting them and then performing a complete re-execution. We show that our combined set of compiler transformations are effective in improving the performance of a set of STAMP benchmarks by reducing the execution time by 7.48% to 54.82%, aborts by 8.98% to 56.61% and the average D-cache miss latency by up to 33.51%.</description>
    <dc:date>2013-03-26T18:30:00Z</dc:date>
  </item>
  <item rdf:about="http://hdl.handle.net/2005/2013">
    <title>Petri Net Model Based Energy Optimization Of Programs Using Dynamic Voltage And Frequency Scaling</title>
    <link>http://hdl.handle.net/2005/2013</link>
    <description>Title: Petri Net Model Based Energy Optimization Of Programs Using Dynamic Voltage And Frequency Scaling
Authors: Arun, R
Abstract: High power dissipation and on-chip temperature limit performance and affect reliability in modern microprocessors. For servers and data centers, they determine the cooling cost, whereas for handheld and mobile systems, they limit the continuous usage of these systems. For mobile systems, energy consumption affects the battery life. It can not be ignored for desktop and server systems as well, as the contribution of energy continues to go up in organizations’ budgets, influencing strategic decisions, and its implications on the environment are getting appreciated. Intelligent trade-offs involving these quantities are critical to meet the performance demands of many modern applications. &#xD;
Dynamic Voltage and Frequency Scaling (DVFS) offers a huge potential for designing&#xD;
trade-offs involving energy, power, temperature and performance of computing systems. In our work, we propose and evaluate DVFS schemes that aim at minimizing energy consumption while meeting a performance constraint, for both sequential and parallel applications.&#xD;
We propose a Petri net based program performance model, parameterized by application properties, microarchitectural settings and system resource configuration, and use this model to find energy efficient DVFS settings. We first propose a DVFS scheme&#xD;
using this model for sequential programs running on single core multiple clock domain&#xD;
(MCD) processors, and evaluate this on a MCD processor simulator. We then extend&#xD;
this scheme for data parallel (Single Program Multiple Data style) applications, and then generalize it for stream applications as well, and evaluate these two schemes on a full system CMP simulator. Our experimental evaluation shows that the proposed schemes achieve significant energy savings for a small performance degradation.</description>
    <dc:date>2013-05-28T18:30:00Z</dc:date>
  </item>
  <item rdf:about="http://hdl.handle.net/2005/1876">
    <title>Energy And Channel-Aware Power And Discrete Rate Adaptation And Access In Energy Harvesting Wireless Networks</title>
    <link>http://hdl.handle.net/2005/1876</link>
    <description>Title: Energy And Channel-Aware Power And Discrete Rate Adaptation And Access In Energy Harvesting Wireless Networks
Authors: Khairnar, Parag S
Abstract: Energy harvesting (EH) nodes, which harvest energy from the environment in order to communicate over a wireless link, promise perpetual operation of wireless networks. The primary focus of the communication system design shifts from being as energy conservative as possible to judiciously handling the randomness in the energy harvesting process in order to enhance the system performance. This engenders a significant redesign of the physical and multiple access layers of communication. &#xD;
In this thesis, we address the problem of maximizing the throughput of a system that consists of rate-adaptive EH nodes that transmit data to a common sink node. We consider the practical case of discrete rate adaptation in which a node selects its transmission power from a set of finitely many rates and adjusts its transmit power to meet a bit error rate (BER) constraint. When there is only one EH node in the network, the problem involves determining the rate and power at which the node should transmit as a function of its channel gain and battery state. For the system with multiple EH nodes, which node should be selected also needs to be determined. &#xD;
We first prove that the energy neutrality constraint, which governs the operation of an EH node, is tighter than the average power constraint. We then propose a simple rate and power adaptation scheme for a system with a single EH node and prove that its throughput approaches the optimal throughput arbitrarily closely. We then arrive at the optimal selection and rate adaptation rules for a multi-EH node system that opportunistically selects at most one node to transmit at any time. The optimal scheme is shown to significantly outperform other ad hoc selection and transmission schemes. The effect of energy overheads, such as battery storage inefficiencies and the energy required for sensing and processing, on the transmission scheme and its overall throughput is also analytically characterized. &#xD;
Further, we show how the time and energy overheads incurred by the opportunistic selection process itself affect the adaptation and selection rules and the overall system throughput. Insights into the scaling behavior of the average system throughput in the asymptotic regime, in which the number of nodes tend to infinity, are also obtained. We also optimize the maximum time allotted for selection, so as to maximize the overall system throughput. &#xD;
For systems with EH nodes or non-EH nodes, which are subject to an average power constraint, the optimal rate and power adaptation depends on a power control parameter, which hitherto has been calculated numerically. We derive novel asymptotically tight bounds and approximations for the same, when the average rate of energy harvesting is large. These new expressions are analytically insightful, computationally useful, and are also quite accurate even in the non-asymptotic regime when average rate of energy harvesting is relatively small. &#xD;
In summary, this work develops several useful insights into the design of selection and transmission schemes for a wireless network with rate-adaptive EH nodes.</description>
    <dc:date>2013-01-07T18:30:00Z</dc:date>
  </item>
  <item rdf:about="http://hdl.handle.net/2005/2008">
    <title>Distributed Wireless Networks : Link Scheduling And Application Delay Modelling</title>
    <link>http://hdl.handle.net/2005/2008</link>
    <description>Title: Distributed Wireless Networks : Link Scheduling And Application Delay Modelling
Authors: Sunny, Albert
Abstract: We address several problems that arise in a multihop wireless mesh network. First, we study the problem of joint congestion control, routing and MAC layer scheduling. We formulate the problem as an aggregate utility maximization problem and apply duality theory to decompose the problem into two sub-problems, namely, network layer congestion control and routing problem, and MAC layer scheduling problem. Given the link “prices", the source adjusts its rate based on the cost of the least-cost path to the destination, and sends traffic to the destination along the least-cost path, while link scheduling is carried out based on link prices. &#xD;
Optimal link scheduling for a wireless network is known to be NP-hard. We explore the use of a known centralized greedy heuristic, and develop a distributed algorithm that can schedule independent links based on local information. While the link scheduling algorithm above is for a given set of link prices, the solution to our problem depends on the sequence of price vectors generated by the price update algorithm. This leads us to study convergence issues related to the price update algorithm. &#xD;
Next, we develop a practical protocol which maximizes aggregate utility in a wireless mesh network. We simulate our protocol using Qualnet 4.5 and compare the result with a baseline protocol that uses IEEE 802.11 for link scheduling and AODV for routing.  &#xD;
Our proposed protocol requires the durations of slots and subslots to be defined. We develop an approach in which given a single cell wireless mesh network using IEEE 802.11 as a reliable message delivery mechanism, one can find upper and lower bounds on the durations of slots. We employ stochastic ordering to compare distributions of random variables and using some properties of stochastic ordering along with the central limit theorem, we devise a way to compute the above mentioned bounds on the durations.&#xD;
In the second part, we shift our focus to model delays incurred by application packets sent over a WLAN. In this section we model the WLAN as a Random Polling System. The packet arrival process at each node i is assumed to be a stationary and independent increment random process with mean ai and second moment a(2)i . The packet lengths at node i are assumed to be i.i.d random variables Pi with finite mean and second moment. Utilizing available results, we obtain expressions for mean packet delay. Extensive simulations are conducted to verify the analytical results.</description>
    <dc:date>2013-05-22T18:30:00Z</dc:date>
  </item>
</rdf:RDF>

