Results from the MMTV-induced tumor study
To test whether genetic background affects gene expression in MMTV-induced mammary tumors, and to identify differentially expressed genes, three mice from each of four mouse strains (HeN, HeJ, YbR, and BALB) were compared with a standard RNA reference, the Stratagene reference, for gene expression profiling on Ontario mouse 15k microarrays. Each mouse was compared with the reference in a dye-swap fashion, so the whole experiment contains 12 dye-swap pairs on 24 slides.
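As a small illustration of why each mouse is hybridized twice with the dyes swapped, the sketch below averages the two log ratios of one dye-swap pair. It is a minimal example under stated assumptions, not the pipeline used in this study: the function names, the intensities, and the 1.5x dye bias are all made up for illustration.

```python
import math

def log_ratio(sample, reference):
    """log2 ratio of sample to reference intensity for one spot."""
    return math.log2(sample / reference)

def dye_swap_estimate(slide1, slide2):
    """Combine a dye-swap pair into one log2(sample/reference) estimate.

    slide1: (cy5, cy3) with the sample labeled Cy5 and the reference Cy3.
    slide2: (cy5, cy3) with the labels swapped (reference Cy5, sample Cy3).
    A multiplicative dye bias shifts the two log ratios in opposite
    directions, so it cancels in the average.
    """
    m1 = log_ratio(slide1[0], slide1[1])  # sample/reference, plus bias
    m2 = log_ratio(slide2[1], slide2[0])  # sample/reference, minus bias
    return (m1 + m2) / 2.0

# Hypothetical spot: true ratio 4 (log2 = 2); the Cy5 channel reads 1.5x high.
true_sample, true_ref, bias = 4000.0, 1000.0, 1.5
slide1 = (true_sample * bias, true_ref)   # sample in Cy5
slide2 = (true_ref * bias, true_sample)   # reference in Cy5
estimate = dye_swap_estimate(slide1, slide2)
print(round(estimate, 6))
```

Either slide alone would be off by log2(1.5); the pair average recovers the true log ratio.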
After hybridization, the slides were scanned using an Axon scanner, and the 16-bit TIFF images were gridded using SPOT (CSIRO Mathematical and Information Sciences). The background-subtracted (morph) mean values were log-transformed and pre-normalized using local lowess to remove spatial and intensity-related biases. The data were then analyzed using MAANOVA 1.2 to remove the dye and spot effects and to estimate the relative expression level of each variety at each gene, using the model y = μ + A + D + V + e.
Three types of F tests were used to identify genes that are expressed differently among the mouse strains by comparing the alternative model (which gives each strain a unique variety ID) with the null model (which assigns all mice the same variety ID). F1 is similar to a t-test and assumes that each gene has its own error distribution. F3 is a statistically formalized version of fold change that captures the magnitude of the expression difference; it assumes that all the genes on the arrays have the same error distribution. F2 is a hybrid of F1 and F3: it combines one half of the gene-specific error and one half of the mean error across all arrays. Permutation of the residuals of the ANOVA model was used to establish the statistical significance of F2 and F3 without distributional assumptions. The volcano plot (Figure 1) shows that 42 spots are significantly different according to the criteria that the tabulated P value of F1 (F1.Ptab) is less than 0.001, the multiple-test-corrected P value of F2 (F2.Pvalmax) is less than 0.05, and the multiple-test-corrected P value of F3 (F3.Pvalmax) is less than 0.05. Requiring all three F tests selects for both a certain magnitude of expression difference and the reproducibility of the selected genes.
Figure 1. Volcano plot for F test results
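The difference between the three F statistics lies only in the error term of the denominator. The sketch below follows the description in the text (gene-specific error for F1, common error for F3, an equal mix for F2). It assumes the variety mean square and the error variances have already been estimated; the function name, gene labels, and numbers are hypothetical, and pooling over two genes is only a toy stand-in for pooling over the whole array.

```python
def f_statistics(ms_variety, gene_var, pooled_var):
    """Return (F1, F2, F3) for one gene.

    ms_variety: mean square of the variety (strain) term for this gene.
    gene_var:   error variance estimated from this gene alone.
    pooled_var: error variance averaged over all genes.
    """
    f1 = ms_variety / gene_var                              # gene-specific error
    f2 = ms_variety / (0.5 * gene_var + 0.5 * pooled_var)   # hybrid error
    f3 = ms_variety / pooled_var                            # common error
    return f1, f2, f3

# Hypothetical genes: (variety mean square, gene-specific error variance).
genes = {
    "small_change_tiny_error": (0.02, 0.001),
    "large_change_typical_error": (2.0, 0.05),
}
pooled_var = sum(var for _, var in genes.values()) / len(genes)

results = {name: f_statistics(ms, var, pooled_var)
           for name, (ms, var) in genes.items()}
for name, (f1, f2, f3) in results.items():
    print(f"{name}: F1={f1:.1f} F2={f2:.1f} F3={f3:.1f}")
```

In this toy case the first gene scores highly on F1 but poorly on F3, which is the kind of gene that the combined criteria are meant to exclude. F2 always falls between F1 and F3.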
Because each gene is spotted twice on the Ontario mouse 15k arrays, only the genes for which both replicate spots were selected by the above criteria are considered top candidates. These are shown in Gene List 1.
GROUP 1
18282 | BG078291 | Mus musculus cyclin D2 (Ccnd2), mRNA
GROUP 2
8604 | BG065626 |
22674 | BG075676 |
30596 | BG063668 |
GROUP 3
15180 | BG067341 |
18048 | BG074047 |
21678 | BG067620 |
22630 | BG074388 |
23620 | BG067670 |
24652 | BG076041 |
27738 | AU046252 |
28830 | BG067439 |
30064 | BG066678 |
30482 | BG075407 |
30798 | BG068331 |
GROUP 4
8166 | BG071318 |
16802 | BG075190 |
23780 | BG071239 |
GROUP 5
21122 | BG070089 | Mus musculus tumor-associated calcium signal transducer 1 (Tacsd1), mRNA
Not in any group
21564 | BG065396 | Homo sapiens mRNA; cDNA DKFZp586L2123 (from clone DKFZp586L2123)
Gene List 1. Significant genes for all three F tests (F1.Ptab < 0.001; F2.Pvalmax < 0.05; F3.Pvalmax < 0.05)
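The selection rule behind Gene List 1 (all three thresholds met, on both replicate spots of a gene) can be sketched as follows. The record layout, the gene names, and the P values are invented for illustration; only the three thresholds are taken from the text.

```python
from collections import defaultdict

def passes(spot):
    """Apply the three thresholds quoted in the text to one spot."""
    return (spot["F1.Ptab"] < 0.001
            and spot["F2.Pvalmax"] < 0.05
            and spot["F3.Pvalmax"] < 0.05)

def genes_with_all_spots_selected(spots, n_replicates=2):
    """Return the genes for which every replicate spot passes the criteria."""
    hits = defaultdict(int)
    for spot in spots:
        if passes(spot):
            hits[spot["gene"]] += 1
    return sorted(g for g, n in hits.items() if n == n_replicates)

# Hypothetical spots: two replicate spots per gene, made-up P values.
spots = [
    {"gene": "geneA", "F1.Ptab": 0.0002, "F2.Pvalmax": 0.01, "F3.Pvalmax": 0.02},
    {"gene": "geneA", "F1.Ptab": 0.0005, "F2.Pvalmax": 0.03, "F3.Pvalmax": 0.04},
    {"gene": "geneB", "F1.Ptab": 0.0001, "F2.Pvalmax": 0.01, "F3.Pvalmax": 0.01},
    {"gene": "geneB", "F1.Ptab": 0.0200, "F2.Pvalmax": 0.01, "F3.Pvalmax": 0.01},
]
top_candidates = genes_with_all_spots_selected(spots)
print(top_candidates)
```

Here geneB is dropped because one of its two spots fails the F1 threshold, even though the other spot passes all three.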
Using K-means clustering on the gene expression values (the VG values), these 20 genes were clustered into 5 groups. The VG profiles for each group are shown in Figure 2.
Figure 2. VG profiles of the 20 genes
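For readers unfamiliar with K-means, the grouping of VG profiles can be sketched with a plain Lloyd's algorithm. This is an illustrative implementation on made-up profiles, not the software used for Figure 2; the starting centers are fixed here so the result is reproducible, whereas practical runs usually try several random starts.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_profile(profiles):
    n = len(profiles)
    return [sum(col) / n for col in zip(*profiles)]

def kmeans(profiles, centers, max_iter=100):
    """Plain Lloyd's k-means; returns (labels, centers)."""
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        # Assign every profile to its nearest center.
        new_labels = [
            min(range(len(centers)),
                key=lambda k: squared_distance(p, centers[k]))
            for p in profiles
        ]
        if new_labels == labels:   # assignments stable: converged
            break
        labels = new_labels
        # Move each center to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(profiles, labels) if lab == k]
            if members:            # keep the old center if a cluster is empty
                centers[k] = mean_profile(members)
    return labels, centers

# Hypothetical VG profiles: one value per strain for each of five genes.
profiles = [
    [1.0, 0.9, -1.0, -0.9],
    [1.1, 1.0, -0.9, -1.1],
    [-1.0, -1.1, 1.0, 0.9],
    [-0.9, -1.0, 1.1, 1.0],
    [0.0, 0.1, 0.0, 2.0],
]
labels, centers = kmeans(profiles,
                         centers=[profiles[0], profiles[2], profiles[4]])
print(labels)
```

Genes with similar patterns across strains end up with the same label, which is what the groups in Gene List 1 represent.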
Because the F2 test combines the strengths of F1 and F3, guarding against genes with low reproducibility and genes with small expression differences, we also report the full list of genes with F2.Pvalmax < 0.05, which includes the 20 genes above and 42 more genes. In the volcano plot (Figure 1), these genes are represented by red points. See Gene List 2 for the genes that are significant by the F2 test.
1708 | BG071503 |
2556 | BG075927 |
2714 | BG065308 | Homo sapiens KIAA0396 mRNA, partial cds
4230 | BG069868 |
5812 | BG076059 |
5902 | BG063471 | Mus musculus fibrillarin (Fbl), mRNA
7576 | BG085134 | M.musculus Cd24a gene
7668 | AW550650 | Mus musculus t-complex testis expressed 1 (Tctex1), mRNA
8166 | BG071318 |
8604 | BG065626 |
10064 | BG070224 |
11752 | BG076877 | Mouse creatine kinase B gene, complete cds
13354 | BG070310 |
13880 | BG067352 |
15180 | BG067341 |
15730 | BG065030 | M.musculus GSHPx gene
16802 | BG075190 |
17482 | BG075397 | Mouse CFh locus, complement protein H gene, complete cds, clones MH(4,8)
18048 | BG074047 |
18282 | BG078291 | Mus musculus cyclin D2 (Ccnd2), mRNA
18672 | BG086330 | Mus musculus microsomal glutathione S-transferase (Gst), mRNA
19222 | AU040587 |
20004 | BG073920 | Mus musculus lactate dehydrogenase 2, B chain (Ldh2), mRNA
20278 | BG078506 | Homo sapiens mRNA; cDNA DKFZp566G2246 (from clone DKFZp566G2246)
20546 | BG071905 | Homo sapiens mRNA; cDNA DKFZp586L0518 (from clone DKFZp586L0518)
20650 | AW552541 |
20920 | BG078804 | Mus musculus BMP-4 gene, complete cds
20992 | BG067264 |
21010 | BG080888 | Homo sapiens cDNA FLJ13397 fis, clone PLACE1001351
21122 | BG070089 | Mus musculus tumor-associated calcium signal transducer 1 (Tacsd1), mRNA
21330 | BG074398 | Mus musculus extracellular matrix protein 2 (Ecm2), mRNA
21564 | BG065396 | Homo sapiens mRNA; cDNA DKFZp586L2123 (from clone DKFZp586L2123)
21678 | BG067620 |
22070 | BG075854 | M.musculus ufo mRNA
22162 | BG077665 | Mus musculus low density lipoprotein receptor related protein (Lrp), mRNA
22232 | BG066232 | Mus musculus high mobility group protein I, isoform C (Hmgic), mRNA
22630 | BG074388 |
22674 | BG075676 |
22806 | BG064504 |
23442 | BG063693 | M.musculus gas5 growth arrest specific gene, exons 4-12
23484 | BG078467 | Mus spretus endogenous proviral sequence S3
23550 | BG065738 | Homo sapiens Rho guanine nucleotide exchange factor (GEF) 3 (ARHGEF3), mRNA
23620 | BG067670 |
23658 | BG080910 | M.musculus of protein S gene, complete CDS
23780 | BG071239 |
23794 | BG071761 |
24652 | BG076041 |
26108 | BG065586 | Homo sapiens cDNA FLJ13069 fis, clone NT2RP3001752
26254 | BG068139 |
26470 | BG073468 | M.musculus DNA for alpha globin gene and flanking regions
27168 | BG073624 |
27738 | AU046252 |
27746 | BG073049 |
28076 | BG065049 | Mouse pro-alpha1 (II) collagen chain gene, complete cds
28386 | BG085072 | Homo sapiens epididymis-specific, whey-acidic protein type, four-disulfide core; putative ovarian carcinoma marker (HE4), mRNA
28740 | BG066208 |
28830 | BG067439 |
28974 | BG083987 | Mouse (clone pIL2) B1 dispersed repeat unit
30064 | BG066678 |
30482 | BG075407 |
30596 | BG063668 |
30798 | BG068331 |
Gene List 2. Significant genes for F2 test
To identify genes that are differentially expressed among the mice regardless of strain, an alternative model that gives every mouse a unique variety ID was compared with a null model that assigns all mice the same variety ID. F tests comparing these two models identified 24 genes that are significant for all three types of F tests described above. Seven of these 24 genes belong to Gene List 1 above. The expression of these 24 genes was used to cluster the 12 mice with a hierarchical clustering approach. The three BALB mice cluster together, and two of the three YbR mice cluster together. The remaining mice cannot be clustered with high confidence, except that mouse 2 of the HeN strain stands alone, which indicates that the expression of this mouse is unusual.
Figure 3. Consensus tree from sample hierarchical clustering
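The idea of clustering mice by their expression profiles can be sketched with a simple average-linkage agglomerative procedure. This builds a single tree on invented data and does not attempt the consensus step behind Figure 3; the distance measure, the linkage rule, the sample names, and the values are all assumptions made for illustration.

```python
def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def average_linkage(samples):
    """Agglomerative clustering; returns the merges in the order they occur.

    samples: dict mapping sample name -> expression profile.
    Each merge is (members_of_cluster_1, members_of_cluster_2, distance).
    """
    clusters = [(name,) for name in samples]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average distance over all cross-cluster pairs of samples.
                pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
                d = sum(euclidean(samples[a], samples[b])
                        for a, b in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

# Hypothetical profiles over three genes; one mouse is made an outlier.
samples = {
    "BALB_1": [2.0, 0.1, 0.0],
    "BALB_2": [2.1, 0.0, 0.1],
    "BALB_3": [1.9, 0.1, 0.1],
    "HeN_1": [0.0, 1.0, 0.2],
    "HeN_2": [0.1, 5.0, 4.0],
}
merges = average_linkage(samples)
for left, right, dist in merges:
    print(left, "+", right, f"at distance {dist:.2f}")
```

A sample that joins the tree only at the final merge, at a large distance, is the kind of outlier that mouse 2 of the HeN strain represents in the text.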
F tests were then conducted to identify the genes that differ significantly between mouse 2 of the HeN strain and the rest of the mice. First, mouse 2 was compared with the mean of all other mice in the experiment. Only three genes were significant according to all three types of F tests.
17942 | BG071923 | Mus musculus myosin light chain 2 (Mlc2), mRNA
24302 | BG068317 |
30270 | BG071387 | Homo sapiens mRNA; cDNA DKFZp564A132 (from clone DKFZp564A132)
Gene List 3. Genes that are different between mouse 2 of strain HeN and the rest
The profile of VGs of these genes in all the mice is shown in Figure 4.
Figure 4. VG profile of mouse 2 versus the rest of the mice
Then, a different set of null and alternative models was used, in which each mouse strain is assigned a unique variety ID and the test targets the difference between mouse 2 and the other two mice in its group. Six genes were identified by all three F tests, and F2 alone identified 28 genes. The VG profile and the gene list are shown in Figure 5 and Gene List 4.
Figure 5. VG profile of mouse 2 vs the rest of the mice - another test
GROUP 1
19940 | BG072209 | Mus musculus sulfated glycoprotein-2 isoform 1 mRNA, complete cds
20808 | BG063515 | Mus musculus ferritin heavy chain (Fth), mRNA
GROUP 2
3808 | BG087551 | Homo sapiens UDP-glucose pyrophosphorylase 2 (UGP2), mRNA
14716 | BG071468 |
16576 | BG070063 | Homo sapiens cDNA: FLJ21267 fis, clone COL01717
21226 | BG072404 | Homo sapiens mRNA for KIAA0828 protein, partial cds
21348 | BG075211 | Mouse adipose differentiation-related protein (ADRP) gene, exons 1-8
21482 | BG077235 | Mus musculus nucleobindin 2 (Nucb2), mRNA
22494 | BG084593 | Mus musculus EIG-1 (Eig1), mRNA
27788 | BG073260 |
GROUP 3
5662 | BG085427 | Mus musculus high mobility group protein 2 (Hmg2) gene, complete cds
14114 | BG072533 | Mus musculus heterogeneous nuclear ribonucleoprotein A1 (Hnrpa1), mRNA
15830 | BG067430 | Mus musculus H3 histone, family 3B (H3f3b), mRNA
GROUP 4
8580 | BG064947 |
14128 | BG073336 |
14474 | BG066300 |
14998 | BG063486 |
16684 | BG084836 | Human DNA for voltage-dependent calcium channel alpha1 subunit (CACN4), exon 48
17662 | BG065548 |
17664 | BG065383 |
23092 | BG083644 | Mus musculus P450 (cytochrome) oxidoreductase (Por), mRNA
GROUP 5
12040 | BG070902 | Mus musculus p8 protein (p8) gene, complete cds
19280 | BG084947 | Mus musculus p8 protein (P8-pending), mRNA
GROUP 6
2824 | BG067549 | Mouse Ig germline kappa V-region gene V-kappa-24A
17942 | BG071923 | Mus musculus myosin light chain 2 (Mlc2), mRNA
21774 | BG083088 | Mus musculus cyclin D1 (Ccnd1), mRNA
24302 | BG068317 |
30270 | BG071387 | Homo sapiens mRNA; cDNA DKFZp564A132 (from clone DKFZp564A132)
Gene List 4. Genes that are different between mouse 2 of strain HeN and the rest - another test
Since each mouse is the true replicate unit in this experiment, we extracted the VG effect for each mouse and performed a one-way ANOVA on these VGs to identify genes that are differentially expressed among strains. As before, three F tests using different error estimates were constructed. The power of these F tests is low because there are only 12 data points for each gene. No gene was identified using the multiple-test-corrected F2 and F3, even at the 0.05 level, whereas 28 genes were identified by F1 at Ptab < 0.001, 10 of which belong to Gene List 1. The VG profiles of these 28 genes are shown in Figure 6.
Figure 6. VG profiles of the 28 genes selected by F1 (Ptab < 0.001)
Groups 3, 4, 6 and 7 have very small VG differences among the mouse strains, which is why they were not selected by the combined criteria of all three F tests in Figure 1. However, they are significant according to F1 because the variation within each strain is also small.
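The one-way ANOVA on per-mouse VG values, and the reason a gene with small strain differences can still score highly on a gene-specific F test, can be illustrated with a short calculation. The function below is a textbook one-way ANOVA F statistic; the two example genes and their values are invented, and the layout (four strains, three mice each) mirrors the design described above.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical VG values: four strains, three mice per strain.
small_but_tight = [[0.10, 0.11, 0.09], [0.20, 0.21, 0.19],
                   [0.10, 0.09, 0.11], [0.20, 0.19, 0.21]]
large_but_noisy = [[1.0, 2.0, 0.0], [2.0, 3.0, 1.0],
                   [1.0, 0.0, 2.0], [2.0, 1.0, 3.0]]

f_tight, df1, df2 = one_way_anova_f(small_but_tight)
f_noisy, _, _ = one_way_anova_f(large_but_noisy)
print(f"tight gene: F={f_tight:.1f} on ({df1}, {df2}) df")
print(f"noisy gene: F={f_noisy:.1f}")
```

With 12 mice in 4 strains the test has only 3 and 8 degrees of freedom. The first gene has strain differences of just 0.1 but a far larger F than the second, because its within-strain variation is tiny.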