A mouse's genetics are reflected in its phenotype, its measurable characteristics including appearance, behavior and physiology. We work on the Mouse Phenome Project, an international collaborative effort seeking to comprehensively characterize a large set of commonly used and genetically diverse inbred strains of mice. All the data are collected and disseminated from the Mouse Phenome Database (MPD). About 1000 measurements for phenotypes (including those relevant to atherosclerosis, blood disorders, cancer susceptibility, neurological and behavioral disorders, sensory function defects, hypertension, osteoporosis and obesity) have been acquired for many strains and more data are expected. The MPD also contains extensive genotypic data, which allows for genotype-phenotype association predictions and facilitates efforts to identify and determine the function of genes participating in normal and disease pathways. Its importance to the research community is demonstrated by the steady increase in its use. The MPD is also closely integrated with specific research projects at The Jackson Laboratory.
Mouse Phenome Project
The Mouse Phenome Project is an international collaborative effort to promote the systematic characterization of a defined set of inbred strains and their derivatives. The project was launched in 2000 to complement mouse genome sequencing efforts for determining genome function. Strain characteristics data are contributed by members of the scientific community. Comprehensive data collected from these efforts are standardized, annotated, and consolidated in a central database for worldwide access. The Mouse Phenome Database (MPD; http://phenome.jax.org/phenome), housed at The Jackson Laboratory, serves as the central data repository for the project. User tools are provided through a web interface for query and analysis. Protocols, experimental conditions, and animal environmental history accompany each contributed data set.
Electronic access to annotated and standardized strain data through MPD provides essential baseline information and enables investigators to choose appropriate strains for many systems-based research applications, including physiological studies, drug and toxicology testing, modeling disease processes, and complex trait analysis.
Significance of a strain characteristics database
There are many challenges to identifying genes underlying human diseases and conditions. Although the growing body of sequence knowledge across multiple species brings new insights to human biology, sequence information alone is not adequate for identifying causal genes and aberrant pathways contributing to disease. Phenotypic data are required for this task so that function can be "mapped" or "linked" to the genome. Mapping function to the genome, also called "association mapping," requires three key pieces of information: the genotype (sequence data), the environmental conditions acting on that genotype, and the resulting phenotype (functional data). Although this may sound simple at first, several issues must be taken into account. First, many human diseases are complex traits, involving multiple genes and perhaps one or more environmental insults, so mapping one gene to one trait will be the exception, not the rule. Second, a sufficient number of genotypes must be examined for any given phenotype to deduce map locations with reasonable confidence. Finally, for the most powerful analyses, as many variables as possible must be controlled (held constant). For example, a high-quality data set could be obtained from a set of individuals with unique, defined genotypes tested under a specific diagnostic (phenotyping) protocol and well-controlled, defined environmental conditions. This would be a powerful data set for association mapping analysis. In reality, however, controlling variables in human studies can be ethically and logistically challenging and, in some cases, impossible.
The Mouse Phenome Project tackles these issues by taking advantage of the natural genetic variation and phenotypic diversity of inbred strains of mice. Inbred strains have fixed genotypes, providing a powerful renewable resource that can be tested over time and in multiple locations.
Reliable genomic sequence data from multiple inbred mouse strains are now available, and SNP maps of substantial density are accessible. The list of priority strains and the testing guidelines, both recommended by members of the research community, are available on the MPD website. These recommendations ensure that data generated in different laboratories and over time are consistent, comparable, and therefore as valuable as possible.
A goal of the Mouse Phenome Project is to facilitate efforts to map function to the genome. To reach that objective, high-quality phenotypic data from a large number of sufficiently genotyped inbred strains are being collected and made publicly available for downloading and analysis. Armed with genotypic data from a diverse set of inbred mouse strains and high-quality phenotypic data on those strains, the challenge becomes one of linking phenotype and genotype through computational methods (in silico analysis). This "phenome approach" can greatly reduce the expense and long timeframes associated with traditional methods of mapping genetic determinants underlying complex phenotypes. Computational biologists and statisticians at The Jackson Laboratory (TJL) are working on this part of the challenge.
The phenome approach is powerful because it is based on quantitative, empirically derived primary data and is not dependent on, or affected by, data annotation. And because multiple strains (genotypes) are tested, this method of investigation captures complexities of entire biological pathways that are simply not accessible through conventional methods.
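The in silico step described above, linking phenotype to genotype across strains, can be illustrated with a minimal sketch. It assumes one phenotype value per strain and biallelic SNP calls, and it scores each SNP with Welch's t statistic between the two allele groups. The strain names, SNP identifiers, and values are hypothetical, and this is not presented as the method used by TJL or MPD. Real analyses would also need to account for relatedness among strains and for multiple testing.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical strain means for one phenotype (illustrative values, not MPD data).
phenotype = {
    "StrainA": 10.0, "StrainB": 11.0, "StrainC": 10.5,
    "StrainD": 20.0, "StrainE": 21.0, "StrainF": 19.5,
}

# Hypothetical biallelic SNP calls per strain, keyed by made-up SNP identifiers.
genotypes = {
    "snp_1": {"StrainA": "A", "StrainB": "A", "StrainC": "A",
              "StrainD": "G", "StrainE": "G", "StrainF": "G"},
    "snp_2": {"StrainA": "C", "StrainB": "T", "StrainC": "C",
              "StrainD": "T", "StrainE": "C", "StrainF": "T"},
}


def welch_t(x, y):
    """Welch's t statistic for two samples with at least two values each."""
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / se


def scan(phenotype, genotypes):
    """Return {snp: |t|} comparing strain phenotype values grouped by allele."""
    scores = {}
    for snp, calls in genotypes.items():
        groups = {}
        for strain, allele in calls.items():
            if strain in phenotype:
                groups.setdefault(allele, []).append(phenotype[strain])
        # Skip SNPs that are not biallelic or have fewer than two strains per allele.
        if len(groups) != 2 or any(len(v) < 2 for v in groups.values()):
            continue
        a, b = groups.values()
        scores[snp] = abs(welch_t(a, b))
    return scores


scores = scan(phenotype, genotypes)
best = max(scores, key=scores.get)  # SNP whose alleles best separate the strains
```

In this toy data the alleles of `snp_1` split the strains into a low group and a high group, so it receives the larger score. With only a handful of strains such a score is suggestive at best, which is why the text stresses examining a sufficient number of genotypes.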
The ability to choose strains for a specific research application by accessing and analyzing existing phenotype data can bypass the need for investigators to invest time and resources (re)characterizing strains. This functionality, in turn, accelerates research and leverages existing community resources.
The Mouse Phenome Project is an ongoing effort and new data sets are added as they become available. MPD contains diverse data types from many sources which are organized into a standard framework conducive to efficient processing and data sharing. The data structures are flexible and accommodate genomic and biological annotations and are scalable for managing large quantities of data from different biological levels (molecular, cellular, organ-system, and whole-animal). MPD currently has data for over 700 measurements, including those relevant to atherosclerosis, blood disorders, cancer susceptibility, neurological and behavioral disorders, sensory function defects, infectious disease susceptibility, pulmonary responsiveness, hypertension, osteoporosis, obesity, metabolic syndrome, and other complex diseases.
Several important studies using strain surveys and the phenome approach have been published in the past year on cancer susceptibility, visual acuity, alcohol effects, obesity, and behavior. In some cases, genes have been identified that affect the phenotype of interest. In addition, a new mouse model has been discovered displaying an autism-like phenotype, and new models have been identified for studying metabolic syndrome at a very detailed level. Mouse Phenome Project collaborators are developing a customizable platform based on a set of inbred strains to predict drug-induced liver injury, and other new technologies are being developed for metabolic profiling and detailed characterization of behavioral phenotypes, embryo morphology, and drug efficacy.
In the past year, TJL investigators contributed data for a number of important biological and medically relevant parameters. Body composition and bone mineral density data are now available through the MPD website for a special strain set of F1 hybrids (G. Churchill Group). The Assisted Reproductive Technologies (ARTs) group contributed essential reproduction data for superovulation, in vitro fertilization, and recovery by embryo transfer. The Phenotyping Sciences group submitted extensive characterization data for JAX® Mice Tier 1 Strains (the top 11 strains in highest demand by the research community). TJL's Integrative Center for Genetic Regulation of Aging submitted its first of many data sets on aging phenotypes (B. Paigen, D. Harrison, J. Sundberg). Two consomic panels have been tested using the extensive multi-system phenotyping pipelines available at TJL through Phenotyping Sciences and JAX PGA; those data, including hematology and blood chemistry parameters, will be posted pending review (Davisson Group; K. Svenson and JAX PGA Group).
In addition to phenotypic data, MPD contains a large collection of genotypic data. The most recent acquisition brings the number of SNPs to about 10 million genome-wide locations consolidated from several community sources. This collection includes the TJL SNP data set (P. Petkov and colleagues) and a dense set of SNPs from 16 JAX® Mice inbred strains (including C57BL/6J; from the Perlegen large-scale resequencing project funded by NIEHS). Mitochondrial SNP data from those 16 JAX® strains are available as well. MPD SNP records are linked to MGD, Ensembl, and NCBI dbSNP. These connections are indispensable for facilitating genotype-phenotype association predictions and efforts to identify genes contributing to normal and disease pathways.
As a data repository, MPD provides downloads of phenotypic and genotypic data. It also offers a number of analysis tools to support exploratory data analysis and discovery. Among the newer tools is a method that helps researchers locate genomic regions where strains (or strain sets) differ the most. The rationale is to point to regions of the genome that may affect phenotypes of interest and potentially lead to the discovery of candidate genes.
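One way to picture the search for regions where strain sets differ the most is a windowed count of SNPs that separate the two sets. The sketch below is an illustration under stated assumptions, not the MPD tool itself: the SNP records, strain names, window size, and the rule that each set must be fixed for a different allele are all hypothetical choices.

```python
from collections import defaultdict

# Hypothetical SNP records: (chromosome, position in bp, {strain: allele}).
# Every strain is assumed to have a call at every SNP.
snps = [
    ("1", 100_000, {"S1": "A", "S2": "A", "S3": "G", "S4": "G"}),
    ("1", 250_000, {"S1": "C", "S2": "C", "S3": "T", "S4": "T"}),
    ("1", 1_200_000, {"S1": "A", "S2": "A", "S3": "A", "S4": "A"}),
    ("1", 1_700_000, {"S1": "G", "S2": "T", "S3": "G", "S4": "T"}),
    ("2", 300_000, {"S1": "C", "S2": "C", "S3": "C", "S4": "T"}),
]


def sets_differ(calls, set_a, set_b):
    """True if each set is fixed for a single allele and the two alleles differ."""
    alleles_a = {calls[s] for s in set_a}
    alleles_b = {calls[s] for s in set_b}
    return len(alleles_a) == 1 and len(alleles_b) == 1 and alleles_a != alleles_b


def rank_windows(snps, set_a, set_b, window=1_000_000):
    """Rank fixed-size windows by the fraction of SNPs that separate the two sets."""
    totals = defaultdict(int)
    differing = defaultdict(int)
    for chrom, pos, calls in snps:
        key = (chrom, pos // window * window)  # (chromosome, window start)
        totals[key] += 1
        if sets_differ(calls, set_a, set_b):
            differing[key] += 1
    ranked = [(differing[k] / totals[k], k) for k in totals]
    return sorted(ranked, reverse=True)  # highest fraction first


ranked = rank_windows(snps, set_a=["S1", "S2"], set_b=["S3", "S4"])
top_fraction, top_window = ranked[0]
```

In this toy data, both SNPs in the first window of chromosome 1 separate the two sets, so that window ranks first. Using a fraction rather than a raw count keeps SNP-dense windows from dominating the ranking, though other scoring choices are equally plausible.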
Members of the research community are encouraged to contribute and share strain survey data. The Mouse Phenome Project accepts data submissions and posts appropriate and carefully reviewed content on the MPD website. The Mouse Phenome Project seeks to establish new collaborations representing a wide variety of phenotypic domains of medical relevance. The NIH and other funding agencies support experts in their fields of study both for primary phenotyping and for more in-depth, domain-specific characterization.
Software Engineer: Stephen C. Grubb, M.S.
Scientific Curator: Terry Maddatu, DVM
(2004 - Present)
Grubb SC, Maddatu TP, Bult CJ, Bogue MA. 2009. Mouse phenome database. Nucleic Acids Res. 37(Database):D720-30.
Goios A, Gusmão L, Rocha AM, Fonseca A, Pereira L, Bogue M, Amorim A. 2008. Identification of mouse inbred strains through mitochondrial DNA single-nucleotide extension. Electrophoresis 29(23):4795-802.
Taylor CF, Field D, Sansone SA, Aerts J, Apweiler R, Ashburner M, Ball CA, Binz PA, Bogue M, et al. 2008. Promoting coherent minimum reporting guidelines for biological and biomedical investigations: the MIBBI project. Nat Biotechnol. 26(8):889-96.
Bogue MA, Grubb SC, Maddatu TP, Bult CJ. 2007. Mouse Phenome Database (MPD). Nucleic Acids Res 35:D643-9.
Feuerer M, Jiang W, Holler PD, Satpathy A, Campbell C, Bogue M, Mathis D, Benoist C. 2007. Enhanced thymic selection of FoxP3+ regulatory T cells in the NOD mouse model of autoimmune diabetes. Proc Natl Acad Sci USA 104(46):18181-6.
Frazer KA, Eskin E, Kang HM, Bogue MA, Hinds DA, Beilharz EJ, Gupta RV, Montgomery J, Morenzoni MM, Nilsen GB, Pethiyagoda CL, Stuve LL, Johnson FM, Daly MJ, Wade CM, Cox DR. 2007. A sequence-based variation map of 8.27 million SNPs in inbred mouse strains. Nature 448(7157):1050-3.
Mouse Phenome Database Integration Consortium. 2007. Integration of mouse phenome data resources. Mamm Genome 18:157-163.
Goios A, Pereira L, Bogue M, Macaulay V, Amorim A. 2007. mtDNA Phylogeny and Evolution of Laboratory Mouse Strains. Genome Res 17:293-298.
Fenske TS, McMahon C, Edwin D, Jarvis JC, Cheverud JM, Minn M, Mathews V, Bogue MA, Province MA, McLeod HL, Graubert TA. 2006. Identification of candidate alkylator-induced cancer susceptibility genes by whole genome scanning in mice. Cancer Res 66(10):5029-38.
Ohmura K, Johnsen A, Ortiz-Lopez A, Desany P, Roy M, Besse W, Rogus J, Bogue M, Puech A, Lathrop M, Mathis D, Benoist C. 2005. Variation in IL-1beta gene expression is a major determinant of genetic differences in arthritis aggressivity in mice. Proc Natl Acad Sci U S A. 102(35):12489-94.
Bogue MA, Grubb SC. 2004. The Mouse Phenome Project. Genetica 122:71-74.
Churchill GA, Airey DC, Allayee H, Angel JM, Attie AD, Beatty J, Beavis WD, Bogue M, et al. 2004. The Collaborative Cross, a community resource for the genetic analysis of complex traits. Nat Genet 36(11):1133-7.
Pletcher MT, McClurg P, Batalov S, Su AI, Barnes SW, Lagler E, Korstanje R, Wang X, Nusskern D, Bogue MA, Mural RJ, Paigen B, Wiltshire T. 2004. Use of a dense single nucleotide polymorphism map for in silico mapping in the mouse. PLOS Biology 2:e93.
Grubb SC, Churchill GA, Bogue MA. 2004. A collaborative database of inbred mouse strain characteristics. Bioinformatics 20:2857-2859.