<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <title>etd@IISc Collection:</title>
  <link rel="alternate" href="http://hdl.handle.net/2005/10" />
  <subtitle />
  <id>http://hdl.handle.net/2005/10</id>
  <updated>2013-06-19T20:24:10Z</updated>
  <dc:date>2013-06-19T20:24:10Z</dc:date>
  <entry>
    <title>Promoter Prediction In Microbial Genomes Based On DNA Structural Features</title>
    <link rel="alternate" href="http://hdl.handle.net/2005/1934" />
    <author>
      <name>Rangannan, Vetriselvi</name>
    </author>
    <id>http://hdl.handle.net/2005/1934</id>
    <updated>2013-02-25T05:58:18Z</updated>
    <published>2013-02-24T18:30:00Z</published>
    <summary type="text">Title: Promoter Prediction In Microbial Genomes Based On DNA Structural Features
Authors: Rangannan, Vetriselvi
Abstract: The promoter region is the key regulatory region, which enables the gene to be transcribed or repressed by anchoring RNA polymerase and other transcription factors, but it is difficult to determine experimentally. Hence in silico identification of promoters is crucial in order to guide experimental work and to pinpoint the key region that controls the transcription initiation of a gene. Analysis of various genome sequences in the vicinity of experimentally identified transcription start sites (TSSs) in prokaryotic as well as eukaryotic genomes had earlier indicated that they have several structural features in common, such as lower stability, higher curvature and less bendability, when compared with their neighboring regions. In this thesis work, the variation observed for these DNA sequence dependent structural properties has been used to identify and delineate promoter regions from other genomic regions. Since the number of bacterial genomes being sequenced is increasing very rapidly, it is crucial to have procedures for rapid and reliable annotation of their functional elements such as promoter regions, which control the expression of each gene or each transcription unit of the genome. The thesis work addresses this requirement and presents the step by step protocols followed to arrive at a generic method for promoter prediction that is applicable across organisms. The paragraphs below give an overview of how the thesis is organized into chapters.&#xD;
An overview of prokaryotic transcriptional regulation, the structural polymorphism adopted by the DNA molecule and its impact on transcriptional regulation is given in the introductory chapter of this thesis (chapter 1).&#xD;
Standardization of promoter prediction methodology - Part I&#xD;
Based on the difference in stability between neighboring upstream and downstream regions in the vicinity of experimentally determined transcription start sites, a promoter prediction algorithm has been developed to identify prokaryotic promoter sequences in whole genomes. The average free energy (E) over known promoter sequences and the difference (D) between E and the average free energy over the random sequence generated using the downstream region of known TSSs (REav) are used as cutoff values to search for promoters in the genomic sequences. Using these cutoff values to predict promoter regions across the entire E. coli genome, a reliability of 70% has been achieved when the predicted promoters were cross verified against the 960 transcription start sites (TSSs) listed in the EcoCyc database. Reliable promoter prediction was obtained when these genome specific threshold values were used to search for promoters in the whole E. coli genome sequence. Annotation of the whole E. coli genome for promoter regions has been carried out with 49% accuracy.&#xD;
Reference&#xD;
Rangannan, V. and Bansal, M. (2007) Identification and annotation of promoter regions in microbial genome sequences on the basis of DNA stability. J Biosci, 32, 851-862.&#xD;
Standardization of promoter prediction methodology - Part II&#xD;
In this chapter, it has been demonstrated that while the promoter regions are in general less stable than the flanking regions, their average free energy varies depending on the GC composition of the flanking genomic sequence. Therefore, a set of free energy threshold values (TSS based threshold values) has been obtained from genomic DNA with varying GC content in the vicinity of experimentally identified TSSs. These threshold values have been used as generic criteria for predicting promoter regions in the E. coli, B. subtilis and M. tuberculosis genomes, using an in-house developed tool ‘PromPredict’. On applying it to predict promoter regions corresponding to the 1144 and 612 experimentally validated TSSs in E. coli (genome %GC : 50.8) and B. subtilis (genome %GC : 43.5), sensitivity values of 99% and 95% and precision values of 58% and 60%, respectively, were achieved. For the limited data set of 81 TSSs available for M. tuberculosis (65.6% GC), a sensitivity of 100% and a precision of 49% were obtained.&#xD;
Reference&#xD;
Rangannan, V. and Bansal, M. (2009) Relative stability of DNA as a generic criterion for promoter prediction: whole genome annotation of microbial genomes with varying nucleotide base composition. Mol Biosyst, 5, 1758-1769.&#xD;
Standardization of promoter prediction methodology - Part III&#xD;
In this chapter, the promoter prediction algorithm and the threshold values have been improved to predict promoter regions on a large scale over 913 microbial genome sequences. The average free energy (AFE) values for the promoter regions as well as their downstream regions are found to differ depending on their GC content, even with respect to translation start sites (TLSs) from the 913 microbial genomes. The TSS based cut-off values derived in chapter 3 do not cover the GC-bins (at 5% intervals) at either extreme. Hence, threshold values have been derived from a subset of translation start sites (TLSs) from all microbial genomes, categorized based on their GC-content. Interestingly, the cut-off values derived with respect to the TSS data set (chapter 3) and the TLS data set are very similar for the in-between GC-bins. Therefore, the TSS based cut-off values derived in chapter 3 have been combined with the TLS based cut-off values (denoted as TSS-TLS based cut-off values) to predict promoters over the complete genome sequences. An average recall value of 72% (which indicates the percentage of protein and RNA coding genes with predicted promoter regions assigned to them) and a precision of 56% are achieved over the 913 microbial genome dataset. These predicted promoter regions have been given a reliability level (low, medium, high, very high and highest) based on the difference in their relative average free energy, which can help users design their experiments with more confidence by using the predictions with higher reliability levels.&#xD;
Reference&#xD;
Rangannan, V. and Bansal, M. (2010) High Quality Annotation of Promoter Regions for 913 Bacterial Genomes. Bioinformatics, 26, 3043-3050.&#xD;
Web applications&#xD;
PromBase : The predicted promoter regions for 913 microbial genomes were deposited into a public domain database called PromBase, which can serve as a valuable resource for comparative genomics studies of general genomic features and also help experimentalists to rapidly access the annotation of the promoter regions in any given genome. This database is freely accessible via the World Wide Web at http://nucleix.mbu.iisc.ernet.in/prombase/.&#xD;
EcoProm : EcoProm is a database that can identify and display the potential promoter regions corresponding to EcoCyc annotated TSSs and genes. It also displays predictions for the whole genomic sequence of E. coli. EcoProm is available at http://nucleix.mbu.iisc.ernet.in/ecoprom/index.htm.&#xD;
PromPredict : The generic promoter prediction methodology described in the previous chapters has been implemented in the tool ‘PromPredict’, which is available at http://nucleix.mbu.iisc.ernet.in/prompredict/prompredict.html.&#xD;
Analysing the DNA structural characteristics of prokaryotic promoter sequences for their predominance&#xD;
Sequence dependent structural properties and their variation in genomic DNA are important in controlling several crucial processes such as transcription, replication, recombination and chromatin compaction. In chapter 6, a quantitative analysis of sequence motifs as well as sequence dependent structural properties, such as curvature, bendability and stability, in the upstream regions of TSSs and TLSs from E. coli, B. subtilis and M. tuberculosis has been carried out in order to assess their predictive power for promoter regions. The correlation between these structural properties and GC-content has also been investigated. Our results show that AFE values (stability) give finer discrimination than %GC in identifying promoter regions, and that stability is the better structural property for delineating promoter regions from non-promoter regions.&#xD;
An analysis of these DNA structural properties has also been carried out on human promoter sequences, and they are observed to correlate with the inactivation status of the X-linked genes in the human genome. Since this deviates from the main theme of the thesis, this chapter has been included as appendix A.&#xD;
General conclusion&#xD;
Stability is the ubiquitous DNA structural property seen in promoter regions. Stability gives finer discrimination for promoter prediction than directly using %GC content. Based on the relative stability of DNA, a generic promoter prediction algorithm has been developed and implemented to predict promoter regions on a large scale over 913 microbial genome sequences. The analysis of the predicted regions across organisms showed highly reliable predictive performance of the algorithm.</summary>
    <dc:date>2013-02-24T18:30:00Z</dc:date>
  </entry>
  <entry>
    <title>Protein-DNA Graphs And Interaction Energy Based Protein Structure Networks</title>
    <link rel="alternate" href="http://hdl.handle.net/2005/1904" />
    <author>
      <name>Vijayabaskar, M S</name>
    </author>
    <id>http://hdl.handle.net/2005/1904</id>
    <updated>2013-01-29T05:51:15Z</updated>
    <published>2013-01-28T18:30:00Z</published>
    <summary type="text">Title: Protein-DNA Graphs And Interaction Energy Based Protein Structure Networks
Authors: Vijayabaskar, M S
Abstract: Proteins orchestrate a number of cellular processes either alone or in concert with other biomolecules like nucleic acids, carbohydrates, and lipids. They exhibit an intrinsic ability to fold de novo to their functional states. The three–dimensional structure of a protein, dependent on its amino acid sequence, is important for its function. Understanding this sequence–structure–function relationship has become one of the primary goals in biophysics. Various experimental techniques like X–ray crystallography, Nuclear Magnetic Resonance (NMR), and site–directed mutagenesis have been used extensively towards this goal. Computational studies mainly include sequence based and structure based approaches. The sequence based approaches, such as sequence alignments, phylogenetic analysis, domain identification and statistical coupling analysis, aim at deriving meaningful information from the primary sequence of the protein. The structure based approaches, on the other hand, use structures of folded proteins. Recent advances in structure determination and efforts by various structural consortia have resulted in an enormous number of structures available for analysis. Numerous observations, such as the allowed and disallowed regions in the conformations of a peptide unit, the hydrophobic core in globular proteins, the existence of regular secondary structures like helices, sheets, and turns, and a limited fold space, have been landmarks in understanding the characteristics of protein structures.&#xD;
The uniqueness of protein structure is attained through non–covalent interactions among the constituent amino acids. Analyses of protein structures show that different types of non–covalent interactions like hydrophobic interactions, hydrogen bonding, salt bridges, aromatic stacking, cation–π interactions, and solvent interactions hold protein structures together. Although such structure analyses have provided a wealth of information, they have largely been performed at a pair–wise level and an investigation involving such pair–wise interactions alone is not sufficient to capture all the determinants of protein structures, since they happen at a global level. This consideration has led to the development of graphs/networks for proteins.  &#xD;
Graphs or networks are collections of nodes connected by edges. Protein Structure Networks (PSNs) can be constructed using various definitions of nodes and edges. Nodes may vary from atoms to secondary structures in proteins, and the edges can range from simple atom–atom distances to distances between secondary structures. To study the interplay of amino acids in structure formation, the most commonly used PSNs consider amino acids as nodes. The criterion for edge definition, however, varies. PSNs can be constructed at a coarse–grained level by considering the distances between Cα/Cβ atoms, any side–chain atoms, or the centroids of the amino acids. At a finer level, PSNs can be constructed using atomic details by considering the interaction types or by computing the extent of interaction between amino acids.&#xD;
Representation of proteins as networks and their analyses has given us a unique perspective on various aspects such as protein structure organization, stability, folding, function, oligomerization and so on. A variety of network properties like the degree distribution, clustering coefficient, characteristic path lengths, clusters, and hubs have been investigated. Most of these studies are carried out on protein structures alone. However, the interaction of proteins with other biopolymers like nucleic acids is vital for many crucial biological processes like transcription and translation. In this thesis, we have attempted to address this problem by constructing and analyzing combined graphs of the structures of protein and DNA. Also, in almost all of the PSN studies, the connections have been made solely on the basis of geometric criteria. In the later part of the thesis, we have taken PSN a step further by defining the non–covalent connections based on chemical considerations in the form of the energies of interactions. &#xD;
The thesis contains two sections. The first part mainly involves the construction and application of PSNs to study DNA binding proteins. The DNA binding proteins are involved in several high fidelity processes like DNA recombination, DNA replication, and transcription. Although protein–DNA interfaces have been extensively analyzed using pair–wise interactions, a network approach provides an additional global perspective. Furthermore, most of the earlier investigations have been carried out from the protein point of view (protein centric), and the present network approach aims to combine both the protein centric and the DNA centric view points through the construction and analysis of protein–DNA graphs. These studies are described in Chapters 3 and 4. The second part of the thesis discusses the development, characterization, and application of protein structure networks based on non–covalent interaction energies. These investigations are presented in Chapters 5 and 6. Chapter 3 discusses the development of Protein–DNA Graphs (PDGs), in which the protein–DNA interfaces are represented as networks. A PDG is a bipartite network in which the amino acids form one set of nodes and the nucleotides form the other set. The extent of interaction between the two diverse types of biopolymers is normalized to define the strength of interaction. Edges are then constructed based on the interaction strength between amino acids and nucleotides. Such a representation, reported here for the first time, provides a holistic view of the interacting surface.&#xD;
&#xD;
The developed PDGs are further analyzed in terms of clusters of interacting residues and identification of highly connected residues, known as hubs, along the protein–DNA interface and discussed in terms of their interacting motifs. Important clusters have been identified in a set of protein–DNA complexes, where the amino acids interact with different chemical components of DNA such as phosphate, deoxyribose and base with varying degrees of connectivity. An analysis of such fragment based PDGs provided insights into the nature of protein–DNA interaction, which could not have been obtained by conventional pair–wise analysis. The predominance of deoxyribose–amino acid clusters in beta–sheet proteins, distinction of the interface clusters in helix–turn–helix and the zipper type proteins are some of the new findings from the analysis of PDGs. Additionally, a potential classification scheme has been proposed for protein–DNA complexes on the basis of their interface clusters. This classification scheme gives a general idea of how the proteins interact with different components of DNA in various complexes. The present graph–based method has provided a deeper insight into the analysis of the protein–DNA recognition mechanisms from both protein and DNA view points, thus throwing more light on the nature and specificity of these interactions (Sathyapriya, Vijayabaskar et al. 2008). &#xD;
Chapter 4 delineates the application of PSN to an important problem in molecular biology. An analysis of interface clusters from multimeric proteins provides a clue to the important residues contributing to the stability of the oligomers. One such prediction was made on the DNA binding protein under starvation from Mycobacterium smegmatis (Ms–Dps) using PSNs. Two types of trimers, Trimer A (tA) and Trimer B (tB), can be derived from the dodecamer because of the inherent three fold symmetry of the spherical crystal structure. The irreversible dodecamerization of these native Ms–Dps trimers, in vitro, is known to be directly associated with the bimodal function (DNA binding and iron storage) of this protein. Interface clusters identified from the PSNs of the derived trimers allowed us to convincingly predict the residues E146 and F47 for mutation studies. The prediction was followed up by our experimental collaborators (Rakhi PC and Dipankar Chatterji), which led to the elucidation of the molecular mechanism behind the in vitro oligomerization of Ms–Dps. The F47E mutant was impaired in dodecamerization, and the double mutant (E146AF47E) was a native monomer in solution. These two observations suggested that the two trimers are important for dodecamerization and that the residues selected are important for the structural stability of the protein in vitro. From the structural and functional characterizations of the mutants, we have proposed an oligomerization pathway of Ms–Dps (Chowdhury, Vijayabaskar et al. 2008).&#xD;
The second part of the thesis involves the development, characterization (Chapter 5) and application (Chapter 6) of Protein Energy Networks (PENs). As mentioned above, the PSNs constructed on the geometric basis efficiently capture the topology and associated properties at the level of atom–atom contact. The chemistry, however, is not completely captured by these network representations, and a wealth of information can be extracted by incorporating the details of chemical interactions. This study is an advancement over the existing PSNs, in terms of edges being defined on the basis of interaction energies among the amino acids. This interaction energy is the resultant of various types of interactions within a protein. Use of such realistic interaction energies in a weighted network captures all the essential features responsible for maintaining the protein structure. &#xD;
The methodology involved in representing proteins as interaction energy weighted networks, with realistic edge weights obtained from standard force fields, is described in Chapter 5. The interaction energies were derived from equilibrium ensembles (obtained using molecular dynamics simulations) to account for the structural plasticity, which is essential for function elucidation. The suitability of this method to study single static structures was validated by obtaining interaction energies on minimized crystal structures of proteins. The PENs were then characterized using network parameters like edge weight distributions, clusters, hubs, and shortest paths. The PENs exhibited three distinct behaviors in terms of the size of the largest connected cluster as a function of interaction energy, namely the pre–transition, transition, and post–transition regions, irrespective of the topology of the proteins. The pre–transition region (energies &lt; –20 kJ/mol) comprises smaller clusters with mainly charged and polar residues as hubs. Crucial topological changes take place in the transition region (–10 to –20 kJ/mol), where the smaller clusters aggregate, through low energy van der Waals interactions, to form a single large cluster in the post–transition region (energies &gt; –10 kJ/mol). These behaviors reinforce the concept that hydrophobic interactions hold together local clusters of highly interacting residues, keeping the protein topology intact (Vijayabaskar and Vishveshwara 2010).&#xD;
The applications of PENs in studying protein organization, allosteric communication, thermophilic stability and the structural relation of remote homologues of TIM barrel families have been outlined in Chapter 6.  &#xD;
In the first case, the weighted networks were used to identify stabilization regions in protein structures and hierarchical organization in the folded proteins, which may provide some insights into the general mechanism of protein folding and stabilization (Vijayabaskar and Vishveshwara 2010). In the second case, the features of communication paths in proteins were elucidated from PENs, and specific paths have been extensively discussed in the case of the PDZ domain, which is known to bring together protein partners, mediating various cellular processes. Changes in the PEN upon ligand binding, resulting in alterations of the shortest paths (energetically most favorable paths) for a small fraction of residues, indicated that allosteric communication is anisotropic in PDZ. The observations also establish that the shortest paths between functionally important sites traverse key residues in the PDZ2 domain. Furthermore, shortest paths in PENs provide us with the exact pathways of communication between residues. Although communication in PDZ has been extensively investigated, detailed information on pathways at the energy level has emerged for the first time from the present PEN analysis (Vijayabaskar and Vishveshwara 2010). In the third case, a set of thermophilic and mesophilic proteins were compared to determine the factors responsible for their thermal stability from a network perspective using PENs. The sub–graph parameters such as cluster population, hubs and cliques were the prominent contributing factors for thermal stability. Also, the thermophilic proteins have a better–packed hydrophobic core. The property of thermophilic proteins to increase stability by increasing connectivity while retaining conformational flexibility is discussed from the perspective of cliques and communities (higher order inter–connection of residues) (Vijayabaskar and Vishveshwara 2010).
Finally, the remote homologues from the TIM barrel fold have been analyzed using PENs to identify the interactions responsible for the maintenance of the fold despite low sequence similarity. A study of conserved interactions in family specific PENs reveals that the formation of the central beta barrel is vital for TIM barrel formation. The beta barrel is formed either by conserved long range electrostatic interactions or by a tandem arrangement of low energy hydrophobic interactions. The contributions of helix–sheet and helix–helix interactions are not conserved in the families. This study suggests that the helix–sheet interactions, formed by sequentially near residues, are common in many folds and hence are formed despite non–conservation, whereas formation of the beta barrel requires long range interactions, which are thus more conserved within the families.&#xD;
The thesis also includes an appendix which discusses a web–tool developed to express proteins as networks and to analyze these networks using different network parameters. The web based program GraProStr allows us to represent proteins as structure graphs/networks by considering the amino acid residues as nodes and representing non–covalent interactions among them as edges. The different networks (classified based on edge definition) which can be obtained using GraProStr are Protein Side–chain Networks (PScNs), Cα/Cβ distance based networks (PcNs) and Protein–Ligand Networks (PLNs). The parameters which can be generated include clusters, hubs, cliques (rigid regions in proteins) and communities (groups of cliques). It is also possible to differentiate the above mentioned parameters for monomers and interfaces in multimeric proteins. The well tested tool is now made available to the scientific community for the first time. GraProStr is available online and can be accessed from http://vishgraph.mbu.iisc.ernet.in/GraProStr/index.html. With a variety of structure networks and a set of easily interpretable network parameters, GraProStr can be useful in analyzing protein structures from a global paradigm (Vijayabaskar, Vidya et al. 2010).&#xD;
In summary, we have extensively studied DNA binding proteins using side–chain based protein structure networks and by integrating the DNA molecule into the network. Also, we have upgraded the existing methodology of generating structure networks by representing both the geometry and the chemistry of residues as interaction energies among them. Using this energy based network we have studied diverse problems like protein structure formation, stabilization, and allosteric communication in detail. The above mentioned methodologies are a considerable advancement over existing structure network representations and have been shown in this thesis to shed more light on the structural features of proteins.</summary>
    <dc:date>2013-01-28T18:30:00Z</dc:date>
  </entry>
  <entry>
    <title>Topology-based Sequence Design For Proteins Structures And Statistical Potentials Sensitive To Local Environments</title>
    <link rel="alternate" href="http://hdl.handle.net/2005/1886" />
    <author>
      <name>Jha, Anupam Nath</name>
    </author>
    <id>http://hdl.handle.net/2005/1886</id>
    <updated>2013-01-17T10:16:36Z</updated>
    <published>2013-01-16T18:30:00Z</published>
    <summary type="text">Title: Topology-based Sequence Design For Proteins Structures And Statistical Potentials Sensitive To Local Environments
Authors: Jha, Anupam Nath
Abstract: Proteins, which regulate most biological activities, perform their functions through their unique three-dimensional structures. The folding of a one-dimensional sequence into this three-dimensional structure is not well understood. The available evidence suggests that protein structures are mostly conserved while sequences are more tolerant to mutations, i.e. a number of sequences can adopt the same fold. The search for optimal sequences for a chosen conformation is known as inverse protein folding, and this thesis takes this approach to address this enigmatic problem.&#xD;
This thesis presents a protein sequence design method based on the native state topology of protein structure. The structural importance of the amino acid positions has been converted into a topological parameter of the protein conformation. This scheme for extracting the topology of structures has been successfully applied to three dimensional lattice structures, and in turn sequences with minimum energy for a given structure are obtained. This technique, along with a reduced amino acid alphabet (a reduced amino acid alphabet is any clustering of the twenty amino acids based on some measure of their relative similarity), has been applied to protein structures to design optimal amino acid sequences for a given structure. These designed sequences are energetically much more favorable than the native amino acid sequence. The utility of this method is further confirmed by showing the similarity between naturally occurring and designed sequences. In summary, a computationally efficient method of designing optimal sequences for a given structure is given.&#xD;
The physical interaction energy between amino acids is an important part of the study of protein-protein interaction, structure prediction, modeling, docking etc. The local environment of amino acids makes a difference between the same amino acid pairs in the protein structure, and so the pair-wise interaction energy of amino acid residues should depend on their respective environment. A local environment dependent, knowledge based potential energy function is developed in this thesis. Two different environments have been considered: the local degree (number of contacts) and the secondary structural element of the amino acids. The investigations have shown that the environment-based interaction preferences of amino acids are able to provide good potential energy functions, which perform exceedingly well in discriminating the native structure from structures with random interactions.&#xD;
Further, membrane proteins are located in a completely different physico-chemical environment, with a different amino acid composition, than water soluble proteins. This work provides reliable potential energy functions which take this different environment into account for the investigation (modeling/prediction) of the structure of helical membrane proteins. Three different environments, parallel and perpendicular to the lipid bilayer and the number of amino acid contacts, are explored to analyze the environmental effects on the potential functions. These environment dependent scoring functions perform exceedingly well in discriminating the native sequence from a set of random sequences.&#xD;
Hydrophobicity of amino acids is a measure of buriedness or exposure to the aqueous environment. The lack of uniformity within the protein environment gives rise to different values of hydrophobicity for the same amino acid, depending on its location inside the protein. The contact based environment dependent hydrophobicity values of all amino acids, separately for globular and membrane proteins, have also been evaluated in this thesis.&#xD;
Apart from developing scoring functions, the packing of helices in membrane proteins is investigated by an approach based on the local backbone geometry and side chain atom-atom contacts of amino acids. A parameter defined in this study is able to capture the essential features of inter-helical packing, which may prove to be useful in modeling of helical membrane proteins.&#xD;
In conclusion, this thesis has described a novel technique to design energetically minimized amino acid sequences which can fold into a given conformation. Also, the environment dependent interaction preference of amino acids in globular proteins is captured in an efficient manner. In particular, the environment dependent scoring function for helical membrane proteins is a first successful attempt in this direction.</summary>
    <dc:date>2013-01-16T18:30:00Z</dc:date>
  </entry>
  <entry>
    <title>Structural Studies On Mycobacterium Tuberculosis Pantothenate Kinase (PanK)</title>
    <link rel="alternate" href="http://hdl.handle.net/2005/1304" />
    <author>
      <name>Chetnani, Bhaskar</name>
    </author>
    <id>http://hdl.handle.net/2005/1304</id>
    <updated>2011-07-18T10:13:44Z</updated>
    <published>2011-07-17T18:30:00Z</published>
    <summary type="text">Title: Structural Studies On Mycobacterium Tuberculosis Pantothenate Kinase (PanK)
Authors: Chetnani, Bhaskar
Abstract: Pantothenate kinase (PanK) is a ubiquitous and essential enzyme that catalyzes the first step in the universal Coenzyme A (CoA) biosynthesis pathway. In this step, pantothenate (Vitamin B5) is converted to 4′-phosphopantothenate, which is subsequently converted to CoA in four enzymatic steps. In bacteria, three types of PanK have been identified, which exhibit wide variations in their distribution, mechanisms of regulation and affinity for substrates. Type I PanK is a key regulatory enzyme in the CoA biosynthesis pathway and its activity is feedback regulated by CoA and its thioesters. As part of a major programme on mycobacterial proteins in this laboratory, structural studies on type I PanK from Mycobacterium tuberculosis (MtPanK) were initiated and the structure of this enzyme in complex with a CoA derivative has been reported earlier. To further elucidate the structural basis of the enzyme action of MtPanK, several crystal structures of the enzyme in complex with different ligands have been determined in the present study. In conjunction with this, solution studies on the enzyme were also carried out.&#xD;
The structures were solved using the well-established techniques of protein X-ray crystallography. The hanging drop vapour diffusion method was used for crystallization in all cases. The X-ray intensity data were collected using a MAR Research imaging plate system mounted on either a Rigaku RU200 or a Bruker-AXS Microstar Ultra II rotating anode X-ray generator. The data were processed using HKL, or MOSFLM and SCALA from the CCP4 suite. The structures were solved by the molecular replacement method using the programs AMoRe and PHASER. Structure refinements were carried out using the programs CNS and REFMAC. Model building was carried out using COOT and the refined structures were validated using PROCHECK and MOLPROBITY. Secondary structure was assigned using DSSP, structural superpositions were made using ALIGN and buried surface area was calculated using NACCESS. Solution studies on CoA binding and catalytic activity were carried out using isothermal titration calorimetry (ITC).&#xD;
To start with, the crystal structures of the complexes of MtPanK were determined with (a) citrate, (b) the non-hydrolysable ATP analog AMPPCP and pantothenate (initiation complex), (c) ADP and phosphopantothenate resulting from phosphorylation of pantothenate by ATP in the crystal (end complex), (d) ATP and ADP, each with half occupancy, resulting from a quick soak of crystals in ATP (intermediate complex), (e) CoA, (f) ADP prepared by soaking and by co-crystallization, which turned out to have identical structures, and (g) ADP and pantothenate. Unlike in the case of the homologous E. coli enzyme (EcPanK), AMPPCP and ADP occupied different, though overlapping, locations in the respective complexes; the same was true of pantothenate in the initiation complex and phosphopantothenate in the end complex. The binding site of MtPanK was found to be substantially preformed while that of EcPanK exhibited considerable plasticity. The difference in the behavior of the E. coli and M. tuberculosis enzymes could be explained in terms of changes in local structure resulting from substitutions. It is unusual for two homologous enzymes to exhibit such striking differences in action, and the changes in the locations of ligands exhibited by M. tuberculosis pantothenate kinase are remarkable and novel.&#xD;
The movement of ligands exhibited by MtPanK during enzyme action appeared to indicate that the binding site of the enzyme was less specific for a particular type of ligand than that of EcPanK. Kinetic measurements of enzyme activity showed that MtPanK had dual substrate specificity for ATP and GTP, unlike the enzyme from E. coli, which showed a much higher specificity for ATP. A molecular explanation for the difference in the specificities of the two homologous enzymes was provided by the crystal structures of the complexes of the M. tuberculosis enzyme with (1) GMPPCP and pantothenate, (2) GDP and phosphopantothenate, (3) GDP, (4) GDP and pantothenate, (5) AMPPCP and (6) GMPPCP, together with the structures of the complexes of the two enzymes involving CoA and different adenyl nucleotides. The explanation was substantially based on two critical substitutions in the amino acid sequence and the local conformational change resulting from them. Dual specificity of the type exhibited by this enzyme is rare, and so are the striking differences between two homologous enzymes in the geometry of the binding site, locations of ligands and specificity.&#xD;
The crystal structures of MtPanK in binary complexes with nucleoside diphosphate (NDP) and nucleoside triphosphate (NTP) provided insights into the natural location and conformation of nucleotides. In the absence of pantothenate, the NDP and the NTP bound with an extended conformation at the same site. In the presence of pantothenate, as seen in the initiation complexes, the NTP had a closed conformation and an altered location. However, the effect of the nucleotide on the conformation and the location of pantothenate was yet to be elucidated, as the natural location of the ligand in MtPanK was not known. An attempt was made to fill this lacuna through X-ray analysis of the binary complexes of MtPanK with pantothenate and two of its derivatives, namely, pantothenol and N-nonyl pantothenamide (N9-Pan). These structures demonstrated that pantothenate, with a somewhat open conformation, occupied a location similar to that occupied by phosphopantothenate in the “end” complexes, which was distinctly different from the location of pantothenate in the “closed” conformation in the ternary “initiation” complexes. The conformation and the location of the nucleotide were also different in the initiation and end complexes. An invariant arginine appeared to play a critical role in the movement of ligands that took place during enzyme action. The structure analysis of the binary complexes with the vitamin and its derivatives completed the description of the locations and conformations of nucleoside di- and triphosphates and pantothenate in different binary and ternary complexes. These complexes provide snapshots of the course of action of MtPanK.</summary>
    <dc:date>2011-07-17T18:30:00Z</dc:date>
  </entry>
</feed>

