Computational and mathematical analysis of functional genetic data
Large-scale DNA sequencing has ushered in a new era in biology, making it possible to analyze species at their most fundamental genetic level and opening new possibilities for medicine. At the same time, massive functional assays that provide data on protein-DNA interactions, protein-RNA interactions, gene expression, metabolomic profiling and more have become increasingly available. My lab is interested in computational and mathematical approaches to analyzing such large data sources in order to understand how genomes function and to make these findings clinically relevant. We develop and use techniques from a variety of disciplines, including statistical inference, molecular evolution and biophysical modeling.
Current Lab Interests
Our lab is currently focusing on several new research areas. We are now developing projects in human and mouse genetics, cancer, epigenetics and RNA biology. Some of our more specific interests include evolutionary processes in cancer, regulatory sequences within RNA and developmental enhancers. These projects involve collaborations with experimentalists both within The Jackson Laboratory and with a number of outside groups. Some of our previous areas of study are described below.
Functions of Highly Conserved Enhancer Sequences
In a given phylogeny, comparative sequence data can be used to infer the functional sequences within genomes. Just as morphological features shared among species (e.g. all vertebrates have a spine) are likely to be important to those species, DNA sequences shared among species are likely to be functional. One of the organisms we have focused on is the model vertebrate Danio rerio, the zebrafish. Our lab collaborates with the Guo lab at UCSF to study conserved noncoding elements (CNEs), sequences in vertebrate intergenic regions whose conservation far exceeds what would be expected under neutral mutation. For example, at a threshold of at least 50 bp and at least 50% sequence identity, there are 73,187 strand-specific CNEs conserved between zebrafish and humans. We have developed computational approaches to identify and analyze CNEs and have also experimentally investigated the functions of ~200 CNEs in developing zebrafish embryos (Li et al 2010). We have studied the relative importance of cis- and trans-regulatory effects on the functional behavior of enhancers (Ritter et al 2010) and have also shown experimentally that transcriptional enhancers can be embedded within coding sequences of vertebrate genes (Ritter et al 2012).
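The length and identity threshold mentioned above can be illustrated with a short Python sketch. This is only an illustration and not the lab's actual pipeline: the function names are hypothetical, and it assumes that a candidate element arrives as a pair of pre-aligned sequences, that length is counted in alignment columns, and that identity is the fraction of columns carrying the same base in both species.

```python
def percent_identity(aln_a: str, aln_b: str) -> float:
    """Fraction of alignment columns where both sequences carry the same base.

    Assumption: gap characters ('-') never count as a match.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    if not aln_a:
        return 0.0
    matches = sum(
        1
        for a, b in zip(aln_a.upper(), aln_b.upper())
        if a == b and a != "-"
    )
    return matches / len(aln_a)


def passes_cne_threshold(aln_a: str, aln_b: str,
                         min_len: int = 50,
                         min_identity: float = 0.5) -> bool:
    """Apply the 'at least 50 bp and at least 50% identity' style of filter."""
    return (len(aln_a) >= min_len
            and percent_identity(aln_a, aln_b) >= min_identity)
```

A filter like this only selects candidates by conservation; whether a candidate actually functions as an enhancer still has to be tested experimentally, as in the zebrafish embryo assays described above.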
Functions Contained in Coding Sequences
We are actively exploring the functions and neutral evolutionary behavior of synonymous sites in coding sequences (Chuang and Li 2004; Chin et al 2005). We have shown, for example, that coding sequences are replete with binding sites for microRNAs, as well as other types of functional sequences such as exonic splicing enhancers. Such sites exert strong selective pressure on the synonymous sites of coding regions (Kural et al 2009; Ding et al 2012; Ritter et al 2012). We are also actively investigating methods to determine functional elements in RNA by combining functional genomic, structural, and modeling approaches (Zarringhalam et al 2012).
Another recent lab interest has been the analysis of high-throughput metabolomic data. Our group develops methods to analyze which aspects of lipid content are important to cancer phenotypes (Kiebish et al 2008). Also, a central challenge in metabolomics is the inference of metabolic processes from high-throughput metabolite measurements. We are developing both equilibrium and dynamic models to mechanistically explain the distributions of lipids found in normal and cancerous tissues using statistical inference approaches (Kiebish et al 2010; Zhang et al 2011; Zarringhalam et al 2012).
Our lab is also interested in a variety of issues in molecular evolution related to the balance of functional and neutral pressures in genomes. These interests have included gene expression evolution (Busby et al 2011) and the evolution of mutation rates. For example, one puzzle is why mutation rates are uniform in some species, such as the sensu stricto yeasts, while rates vary by location in other species, such as mouse and human. We have found that all mammalian species have regional mutation biases, typically on a scale of several megabases. In contrast, all yeasts have uniform mutation rates, with the exception of the Candida clade (Chuang and Li 2004; Chin et al 2005; Chuang and Li 2007; Fox et al 2008). In species where the mutation rate is non-uniform, we are interested in questions such as what structural or sequence features affect mutation rates (Imamura et al 2009) and whether gene locations have evolved to make use of mutational heterogeneity.
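As a rough illustration of what a regional mutation bias means in practice, the Python sketch below bins putatively neutral sites into fixed-size windows along a chromosome and reports the fraction substituted in each window. It is not the method of the cited papers: the function name, the input format (pairs of position and substituted flag) and the 5 Mb default window are all assumptions chosen for illustration.

```python
from collections import defaultdict


def regional_rates(sites, window=5_000_000):
    """Estimate a per-window substitution rate.

    sites  : iterable of (position, substituted) pairs, where `substituted`
             is True if the site differs between the two species compared.
    window : window size in bp (hypothetical default of 5 Mb).
    Returns a dict mapping each window start to substituted / total sites.
    """
    counts = defaultdict(lambda: [0, 0])  # window start -> [substituted, total]
    for pos, substituted in sites:
        start = (pos // window) * window
        if substituted:
            counts[start][0] += 1
        counts[start][1] += 1
    return {start: subs / total
            for start, (subs, total) in sorted(counts.items())}
```

With uniform mutation rates the window estimates should differ only by sampling noise, while a regional bias shows up as windows whose rates differ by more than chance would allow. Judging that properly requires a statistical test, which this sketch omits.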
Principal Investigator: Jeffrey Chuang, Ph.D.
Visiting Investigator: Ivan Dotu, Ph.D.
Zarringhalam K, Meyer M, Dotu I, Chuang JH, Clote P. 2012. Integrating chemical footprinting data into RNA secondary structure prediction. PLoS ONE, in press.
Zarringhalam K, Zhang L, Kiebish MA, Yang K, Han X, Gross RW, Chuang JH. 2012. Statistical analysis of the processes controlling choline and ethanolamine glycerophospholipid molecular species composition. PLoS ONE 7(5): e37293. doi:10.1371/journal.pone.0037293.
Ritter DI, Dong Z, Guo S, Chuang JH. 2012. Transcriptional Enhancers in Protein-Coding Exons of Vertebrate Developmental Genes. PLoS ONE 7(5): e35202. doi:10.1371/journal.pone.0035202.
Ding Y, Lorenz WA, Chuang JH. 2012. CodingMotif: Exact Determination of Overrepresented Nucleotide Motifs in Coding Sequences. BMC Bioinformatics, 13:32.
Busby MA, Gray JM, Costa AM, Stewart C, Stromberg MP, Barnett D, Chuang JH, Springer M, Marth GT. 2011. Expression divergence measured by transcriptome sequencing of four yeast species. BMC Genomics, 12:635.
Zhang L, Bell RJA, Kiebish MA, Seyfried TN, Han X, Gross RW, Chuang JH. 2011. A Mathematical Model for the Determination of Steady-State Cardiolipin Remodeling Mechanisms Using Lipidomic Data. PLoS ONE, 6:e21170.
Cai D, Ren L, Zhao H, Xu C, Zhang L, Yu Y, Wang H, Lan Y, Roberts MF, Chuang JH, Naughton MJ, Ren Z, Chiles TC. 2010. A molecular-imprint nanosensor for ultrasensitive detection of proteins. Nature Nanotechnology, 5:597.
Ritter DI, Li Q, Kostka D, Pollard KS, Guo S, Chuang JH. 2010. The Importance of Being Cis: Evolution of Orthologous Fish and Mammalian Enhancer Activity. Mol Biol Evol 27:2322.
Kiebish MA, Bell R, Yang K, Phan T, Zhao Z, Ames W, Seyfried TN, Gross RW, Chuang JH, Han X. 2010. Dynamic simulation of cardiolipin remodeling: Greasing the wheels for an interpretative approach to lipidomics. J Lipid Res 51:2153.
Li Q, Ritter D, Yang N, Dong Z, Li H, Chuang JH, Guo S. 2010. A systematic approach to identify functional motifs within vertebrate developmental enhancers. Developmental Biology 337:484.
Kural D, Ding Y, Wu J, Korpi AM, Chuang JH. 2009. COMIT: Identification of Noncoding Motifs under Selection in Coding Sequence. Genome Biology 10:R133.
Merrick CJ, Dzikowski R, Imamura H, Chuang J, Deitsch K, Duraisingh MT. 2009. The effect of Plasmodium falciparum Sir2a histone deacetylase on clonal and longitudinal variation in expression of the var family of virulence genes. Int J Parasitol 40:35.
Imamura H, Karro JE, Chuang JH. 2009. Weak preservation of local neutral mutation rates across mammalian genomes. BMC Evolutionary Biology 9:89.
Kiebish MA, Han X, Cheng H, Chuang JH, Seyfried TN. 2008. Cardiolipin and electron transport chain abnormalities in mouse brain tumor mitochondria: Lipidomic evidence supporting the Warburg theory of cancer. J Lipid Res, 49:2545-2556.
Persampieri J, Ritter DI, Lees D, Lehoczky J, Li Q, Guo S, Chuang JH. 2008. cneViewer: A Database of Conserved Noncoding Elements for Studies of Tissue-Specific Gene Regulation. Bioinformatics 24:2418-2419.
Kiebish MA, Han X, Cheng H, Chuang JH, Seyfried TN. 2008. Brain Mitochondrial Lipid Abnormalities in Mice Susceptible to Spontaneous Gliomas. Lipids 43:951-959.
Fox AK, Tuch BB, Chuang JH. 2008. Measuring the Prevalence of Regional Mutation Rates: An Analysis of Silent Substitutions in Mammals, Fungi, and Insects. BMC Evolutionary Biology 8:186.
Kiebish MA, Han X, Cheng H, Lunford A, Clarke CF, Moon H, Chuang JH, Seyfried TN. 2008. Lipidomic Analysis and Electron Transport Chain Activities in C57BL/6J Mouse Brain Mitochondria. J Neurochemistry 106:299–312.
Chuang JH, Li H. 2007. Similarity of Synonymous Substitutions Rates Across Mammalian Genomes. J Molec Evol 65:236.
Imamura H, Persampieri J, Chuang JH. 2007. Sequences Conserved by Selection across Mouse and Human Malaria Species. BMC Genomics 8:372.
Alvarez-Lorenzo C, Concheiro A, Chuang JH, Yu A. Smart Polymers: Applications in Biotechnology and Biomedicine, Second Edition (Grosberg). Chapter title: Imprinting Using Smart Polymers. CRC Press: Boca Raton (2007).
Chin CS, Chuang JH, Li H. 2005. Genome-wide Regulatory Complexity in Yeast Promoters: Separation of Functional and Neutral Sequence. Genome Res 15:205.
Chuang JH, Li H. 2004. Functional Bias and Spatial Organization of Genes in Mutational Hot and Cold Regions in the Human Genome. PLOS Biology 2, 0253, doi:10.1371/journal.pbio.