Overview
A mammalian genome comprises some tens of thousands of genes, and an equal or greater number of additional functional elements. Our studies aim to improve understanding of how these elements interact and how modification or disruption of those interactions can lead to developmental problems or genetic disease. We conduct our research both at a global level, characterizing complex interactive networks, and at smaller scales, studying the interactions that regulate specific genes or groups of genes. In collaboration with several other Laboratory researchers, we helped establish the Center for Genome Dynamics, which seeks to better understand chromosome evolution, organization and function through a systems genetics approach.
Our analysis of gene regulation focuses on the post-transcriptional stage of expression, when the information has been transcribed into RNA but not yet translated into protein. We aim to identify and characterize sequence elements embedded within the RNA transcript that control its lifetime, localization and translation to protein. As we improve our understanding of the fundamental mechanisms of gene activation, we simultaneously improve our ability to understand how gene regulation can be disrupted, often resulting in either disease or developmental problems.
Scientific report
Computational Studies of Gene Regulation and Genome Organization
The advent of genome-scale biology has provided biologists with enormous amounts of data to analyze, understand, and incorporate into ever-improving models of how organisms function at a molecular level. A mammalian genome comprises some tens of thousands of genes, and an equal or greater number of additional functional elements. Our studies aim to improve understanding of how these elements interact and how modification or disruption of those interactions can lead to developmental problems or genetic disease. We work both at a global level, characterizing complex interactive networks, and at smaller scales, where we focus on mechanisms that govern the processing and regulation of specific genes or groups of genes at the mRNA transcript stage of expression.
Evolution and functional organization of metazoan chromosomes
Inbred strains of mice provide a unique opportunity for exploring the forces that shape the genome. These strains of mice represent diverse genomes originally separated by millions of generations that have been scrambled in the laboratory and subjected to intense selection during inbreeding to homozygosity. In collaborative work with Senior Staff Scientists Kenneth Paigen, Beverly Paigen, and Gary Churchill and Associate Research Scientist Petko Petkov, we have established The Center for Genome Dynamics at The Jackson Laboratory, an NIH-NIGMS-funded Center for Systems Biology, which is focused on integrating multiple disparate data types to better understand chromosome evolution, organization, and function.
Nearly all of the data we work with can be cast into a common pattern, in which we have evidence of putative relationships between distinct loci on the genome. Examples of these data include, but are not limited to, linkage disequilibrium (LD) between single nucleotide polymorphisms (SNPs), which indicates correlated variation among the inbred strains of mice; correlated expression patterns of genes when compared across collections of experiments; curated databases of known pathways and gene networks; and putative interactions such as those implied by "two-hybrid" binding assays. Graph theory provides a natural representation of these data in which nodes represent genomic loci, and edges are created between nodes that have sufficient evidence of association.
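The graph representation described above can be sketched in a few lines of Python. This is an illustration only, not the software used in our studies: the locus names, the association scores, and the threshold of 0.8 are all hypothetical, and a single numeric score per pair of loci is an assumed simplification of "sufficient evidence of association."

```python
from collections import defaultdict


def build_association_graph(pair_scores, threshold):
    """Build an undirected graph as adjacency sets.

    pair_scores maps (locus_a, locus_b) to an association score
    (hypothetical here; it could stand for LD or expression correlation).
    An edge is created only when the score meets the threshold.
    """
    graph = defaultdict(set)
    for (locus_a, locus_b), score in pair_scores.items():
        if locus_a != locus_b and score >= threshold:
            graph[locus_a].add(locus_b)
            graph[locus_b].add(locus_a)
    return dict(graph)


def connected_components(graph):
    """Group nodes into connected components with a depth-first search."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        components.append(component)
    return components


# Made-up scores for five loci; only pairs scoring >= 0.8 become edges.
example_scores = {
    ("locus_A", "locus_B"): 0.92,
    ("locus_B", "locus_C"): 0.85,
    ("locus_C", "locus_D"): 0.10,
    ("locus_D", "locus_E"): 0.88,
}
example_graph = build_association_graph(example_scores, threshold=0.8)
example_components = connected_components(example_graph)
```

In this toy input the weak C-D association is dropped, so the loci fall into two groups: A, B, C and D, E. Connected components are the simplest possible notion of a group of associated loci; other definitions of closely connected nodes would need a different grouping step.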
Preliminary studies of SNP LD demonstrated that the resulting pattern of chromosome organization includes local domains of functionally related elements that are manifestations of closely connected nodes in the larger network structure. These domains have the potential to promote the co-inheritance and survival of compatible sets of alleles. Comparison of the LD patterns with available gene annotation has identified biological functions underlying some domains and networks. The strong conservation of gene order among mammals indicates that the domains and networks we find arose prior to the mammalian radiation beginning some 90 million years ago and likely characterize all mammals.
Our group is developing the mathematical framework in which the varied data sources can be integrated to better delineate both local and global organization of the genome. Successful modeling of these data requires a broad array of techniques, including, but not limited to, Hidden Markov Models, comparative genome alignment and syntenic analysis, graph theory, and dynamic programming.
Computational studies of post-transcriptional gene regulation
Gene regulation is controlled by a complex mixture of forces that can act at any of the stages of gene activation: transcription of DNA to a precursor RNA, post-transcriptional processing of the RNA, translation of the RNA to a protein, or post-translational modification of the final protein product. Examples of post-transcriptional gene regulation are available across a wide range of organisms and biological processes, and include such varied phenomena as 3´-end formation (cleavage and polyadenylation), splicing, localization, degradation, editing, and translational suppression or enhancement. As we improve our understanding of the fundamental mechanisms of gene regulation, we simultaneously improve our ability to understand how gene regulation can be disrupted, often resulting in either disease or developmental problems.
Post-transcriptional gene regulation is used extensively in early mammalian development. We collaborated with Senior Staff Scientist Barbara Knowles and Research Scientist Alexei Evsikov in a study that compared a large (~19,000) EST library from the full-grown oocyte with the previously released two-cell embryo library. In this work, we identified maternal transcripts with differential stability and/or translational control. Such post-transcriptional regulatory control is often mediated through sequence elements embedded in the untranslated regions (UTRs) of the affected transcript. We compared the 3´-UTR sequences of stable and transient transcripts, and identified several motifs that occur at significantly different frequencies. These motifs are candidate regulatory elements for further computational and bench investigation.
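One simple way to compare motif frequencies between two sets of 3´-UTR sequences is sketched below. It is an assumed approach for illustration, not a description of the analysis we performed: it counts how many sequences in each set contain each k-mer and applies a two-sided Fisher's exact test to the resulting 2x2 table. The function names and the toy sequences are hypothetical, and no multiple-testing correction is applied, which a real analysis would require.

```python
from math import comb


def kmer_presence_counts(sequences, k):
    """Count, for each k-mer, how many sequences contain it at least once."""
    counts = {}
    for seq in sequences:
        kmers_in_seq = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        for kmer in kmers_in_seq:
            counts[kmer] = counts.get(kmer, 0) + 1
    return counts


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(low, high + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


def compare_kmer_presence(set_a, set_b, k):
    """Return (kmer, n_in_a, n_in_b, p_value) tuples sorted by p-value."""
    counts_a = kmer_presence_counts(set_a, k)
    counts_b = kmer_presence_counts(set_b, k)
    results = []
    for kmer in sorted(set(counts_a) | set(counts_b)):
        n_a = counts_a.get(kmer, 0)
        n_b = counts_b.get(kmer, 0)
        p = fisher_exact_two_sided(n_a, len(set_a) - n_a,
                                   n_b, len(set_b) - n_b)
        results.append((kmer, n_a, n_b, p))
    results.sort(key=lambda r: r[3])
    return results


# Made-up sequences, far too few for real inference.
stable_utrs = ["UUUUAUUUUA", "GCUUUUAUGC", "UUUUAUCCGG"]
transient_utrs = ["GCGCGCGCGC", "CCGGCCGGCC", "GGCCGCGCCG"]
example_results = compare_kmer_presence(stable_utrs, transient_utrs, k=6)
```

Counting presence per sequence rather than total occurrences keeps each transcript as one independent observation, which is what the 2x2 test assumes. Position-dependent effects are ignored in this sketch.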
As is the case in oogenesis, spermatogenesis is characterized by tissue- and stage-specific mRNA processing. In particular, we are interested in the formation of the 3´ end of the transcript, a two-step process (referred to as 3´-processing) that consists of cleavage of the precursor RNA transcript followed by polyadenylation. In collaborative work with Dr. Clinton MacDonald of Texas Tech University, we used tissue-specific EST sequences to identify and characterize 3´-processing sites specific to various stages of spermatogenesis. The 3´-processing control sequence comprises several distinct sequence elements, including the well-known canonical AAUAAA element. Our studies indicate that the balance between these elements changes significantly during spermatogenesis, implying a corresponding change in the activity of the trans-acting protein factors that interact with these elements. The use of alternative 3´-processing sites also changes the 3´-UTR sequence, which in turn can alter the post-transcriptional regulation of the affected genes.
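As a minimal sketch of how one element of the 3´-processing control sequence might be located computationally, the following Python function searches for the canonical AAUAAA hexamer in a window upstream of a known cleavage position. The function name is hypothetical, and the default window of 10 to 30 nucleotides upstream is an assumption based on the typical placement of this element, not a parameter taken from our studies.

```python
def find_upstream_signal(transcript, cleavage_index, signal="AAUAAA",
                         min_dist=10, max_dist=30):
    """Return distances from each signal start to the cleavage position.

    transcript may be RNA or cDNA (T is converted to U).
    cleavage_index is the 0-based position immediately after the last
    transcribed nucleotide before the poly(A) tail.
    Only occurrences starting min_dist..max_dist nt upstream are reported.
    """
    seq = transcript.upper().replace("T", "U")
    hits = []
    for dist in range(min_dist, max_dist + 1):
        start = cleavage_index - dist
        if start < 0:
            break  # window extends past the start of the sequence
        if seq[start:start + len(signal)] == signal:
            hits.append(dist)
    return hits


# Made-up cDNA fragment: the hexamer starts 22 nt upstream of the 3' end.
example_transcript = "GCGC" + "AATAAA" + "C" * 16
example_hits = find_upstream_signal(example_transcript,
                                    cleavage_index=len(example_transcript))
```

Passing a different `signal` string would let the same scan look for variant hexamers or other elements, although real control sequences are better described by position-dependent frequency models than by exact string matches.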
Comparative genomic analysis provides a mechanism to better understand and model the small sequence elements that regulate gene processing and expression. We used such an approach to propose a novel mechanism for interaction between precursor RNA and the protein CSTF2, which is a component of the protein machinery responsible for 3´-processing. Through analysis of 3´-processing sites in ten metazoans, we identified putative regulatory elements, manifested as statistically significant patterns of sequence content and relative positioning. Analysis of CSTF2 protein sequences from the same organisms revealed correlated variation of specific amino acid and nucleic acid patterns in the putative binding sites. Our new model for this protein-RNA interaction encompasses and is consistent with a broad body of previous experimental studies.
Lab staff
Principal Investigator: Joel H. Graber, Ph.D.
Scientific Software Engineer: Lucie Hutchins
Software Engineer: Nazira Bektassova
Postdoctoral Fellows: Nicole Leahy, Ph.D., Daniela Kamir, Ph.D.
Collaborators: Carol J. Bult, Ph.D.; Gary A. Churchill, Ph.D.; Wilhelmine de Vries, Ph.D.; John J. Eppig, Ph.D.; Wayne N. Frankel, Ph.D.; Barbara B. Knowles, Ph.D.; Kenneth Paigen, Ph.D.; Anne E. Peaston, Ph.D.; Petko M. Petkov, Ph.D.; Kevin D. Mills, Ph.D.; Clifford J. Rosen, M.D.; Lindsay S. Shopland, Ph.D.; Thomas Blumenthal, Ph.D., University of Colorado, Boulder; Keith W. Hutchison, Ph.D., University of Maine, Orono; Clinton C. MacDonald, Ph.D., Texas Tech University; Janet Rowley, M.D., University of Chicago
Research Administrative Assistant: Patricia Cherry
Publication listings
(2005-present)
Salisbury J, Hutchison KW, Wigglesworth K, Eppig JJ, Graber JH. 2009. Probe-level analysis of expression microarrays characterizes isoform-specific degradation during mouse oocyte maturation. PLoS One 4(10):e7479.
Yang H, Ding Y, Hutchins LN, Szatkiewicz J, Bell TA, Paigen BJ, Graber JH, de Villena FP, Churchill GA. 2009. A customized and versatile high-density genotyping array for the mouse. Nat Methods 6(9):663-666.
Singh P, Alley TL, Wright SM, Kamdar S, Schott W, Wilpan RY, Mills KD, Graber JH. 2009. Global changes in processing of mRNA 3' untranslated regions characterize clinically distinct cancer subtypes. Cancer Res (in press).
De Vries WN, Evsikov AV, Brogan LJ, Anderson CP, Graber JH, Knowles BB, Solter D. 2008. Reprogramming and Differentiation in Mammals: Motifs and Mechanisms. Cold Spring Harb Symp Quant Biol 73.
Hutchins LN, Murphy SM, Singh P, Graber JH. 2008. Position-dependent motif characterization using non-negative matrix factorization. Bioinformatics 24(23):2684-2690.
Paigen K, Szatkiewicz JP, Sawyer K, Leahy N, Parvanov ED, Ng SH, Graber JH, Broman KW, Petkov PM. 2008. The recombinational anatomy of a mouse chromosome. PLoS Genet 4(7):e1000119. PMC2440539
Graber JH, Salisbury J, Hutchins LN, Blumenthal T. 2007. C. elegans sequences that control trans-splicing and operon pre-mRNA processing. RNA 13(9):1409-1426.
Liu D, Brockman JM, Dass B, Hutchins LN, Singh P, McCarrey JR, MacDonald CC, Graber JH. 2007. Systematic variation in mRNA 3'-processing signals during mouse spermatogenesis. Nucleic Acids Res 35(1):234-246.
Petkov PM, Graber JH, Churchill GA, DiPetrillo K, King BL, Paigen K. 2007. Evidence of a large-scale functional organization of mammalian chromosomes. PLoS Biol 5(5):e127.
Brown AC, Lerner CP, Graber JH, Shaffer DJ, Roopenian DC. 2006. Pooling and PCR as a method to combat low frequency gene targeting in mouse embryonic stem cells. Cytotechnology 51(2):81-88.
Evsikov AV, Graber JH, Brockman JM, Hampl A, Holbrook AE, Singh P, Eppig JJ, Solter D, Knowles BB. 2006. Cracking the egg: molecular dynamics and evolutionary aspects of the transition from the fully grown oocyte to embryo. Genes Dev 20:2713-2727.
Graber JH, Churchill GA, DiPetrillo KJ, King BL, Petkov PM, Paigen K. 2006. Patterns and mechanisms of genome organization in the mouse. J Exp Zoolog 305A(9):683-688.
Liu D, Graber JH. 2006. Quantitative comparison of EST libraries requires compensation for systematic biases in cDNA generation. BMC Bioinformatics 7:77.
Salisbury J, Hutchison KW, Graber JH. 2006. A multispecies comparison of the metazoan 3'-processing downstream elements and the CstF-64 RNA recognition motif. BMC Genomics 7:55.
Brockman JM, Singh P, Liu D, Quinlan S, Salisbury J, Graber JH. 2005. PACdb: PolyA cleavage site and 3'-UTR database. Bioinformatics 21:3691-3693.
Petkov PM, Graber JH, Churchill GA, DiPetrillo KJ, King BL, Paigen K. 2005. Evidence of a large-scale functional organization of mammalian chromosomes. PLoS Genet 1(3):e33.