README - MAANOVA 2.0
May 2003
MAANOVA is an extensible interactive environment for the analysis of two-color cDNA microarray experiments. It is implemented as a set of functions for Matlab (www.mathworks.com). The engine functions were written in C for better performance.
MAANOVA stands for MicroArray ANalysis Of VAriance. It provides a complete workflow for microarray data analysis, including:
- Data quality checks and visualization
- Data transformation
- ANOVA model fitting for both fixed and mixed effects models
- Statistical tests including permutation analysis
- Confidence intervals by bootstrapping
- Cluster analysis with bootstrapping
MAANOVA can be applied to any microarray data, but it is specifically tailored for multiple-factor experimental designs. Mixed effects models are implemented to estimate variance components and to perform F and t tests for differential expression.
I. System requirements
To run MAANOVA 2.0 you will need:
- Matlab version 6.0 for Windows (95/98/NT) or higher
- Statistics toolbox version 2 or higher
All functions were written and tested in Matlab R13 (version 6.5) for Windows. Problems may occur with earlier Matlab versions and on other operating systems.
II. Installation
1. Download and save the zip file in a new directory, e.g., c:\maanova.
2. For Windows users: if you have WinZip or WinRAR installed, double-click the file icon and extract the files.
3. For Unix users: type the commands
    gunzip maanova_2_0_linux.tar.gz
    tar -xvf maanova_2_0_linux.tar
to extract the files from the tarball.
4. Several engine functions (currently only one, clowess.c) are written in C to improve performance. Both the compiled binaries and the C source code are distributed. The binaries were compiled and tested in Matlab R12 (Windows/Linux), so you may encounter problems with other Matlab versions and/or operating systems. If the binaries do not work for you, do the following:
- Delete the binary files (*.dll for Windows and *.mexglx for Linux) and type "mex -O clowess.c" to recompile the code yourself. Note that Matlab R12 (version 6.0) and above have a built-in C compiler. If you are using an earlier version, you need an external C compiler to compile the code. For Windows, the compiler could be Microsoft Visual C, Borland C, etc. For Unix/Linux, the compiler could be gcc. Type 'help mex' in the Matlab environment to get help.
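The recompilation step can be sketched as a few Matlab commands. This is only an illustration, assuming the current directory is the one holding clowess.c and that the distributed binaries are named clowess.dll and clowess.mexglx (the file names are inferred from the extensions mentioned above, not confirmed by the distribution).

```matlab
% Run from the directory that contains clowess.c.
% Binary file names below are assumed from the *.dll / *.mexglx extensions.
if exist('clowess.dll', 'file')
    delete('clowess.dll');        % Windows binary
end
if exist('clowess.mexglx', 'file')
    delete('clowess.mexglx');     % Linux binary
end

mex -O clowess.c                  % rebuild with optimization, as described above

which clowess                     % shows which clowess file Matlab will now call
```

If mex reports that no compiler is configured, 'mex -setup' lets you choose one.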
III. Functionality
MAANOVA is designed for sophisticated users, who have to write their own scripts to call the functions. Several demo scripts with data files are available for download. The basic functionality of MAANOVA includes:
- Microarray data normalization and manipulation. Available methods are shift, lowess, loess, linlog and linlogshift.
- Fitting the fixed effects ANOVA model of Kerr et al.
- Hypothesis testing, including:
  - F-tests in several flavors
  - Multiple test adjustment by a one-step permutation method
  - Bootstrap confidence intervals for VG effects
- Data visualization, including:
  - Ratio-intensity plot for raw and normalized data
  - Spatial pattern of data
  - Volcano plot for F test results
- Clustering, including:
  - Hierarchical clustering
  - K-means clustering
  - Bootstrap assessment of clusters
  - Consensus tree building
For details on the function list, data structures, and function syntax, see the software manual.
IV. Input data
Before running the software, you should have three pieces of data at hand: the raw data (data), the number of replicates (rep), and the variety ID (varid). The raw data is a matrix, the number of replicates is an integer, and the variety ID is a row vector. The sizes must be consistent: the number of rows of the raw data should equal the number of genes times rep, and the number of columns should equal the length of the variety ID vector. If the sizes do not match, the software issues an error message.
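The size rules can be checked by hand before calling any MAANOVA function. The following is a minimal sketch using only core Matlab functions. The toy values and the helper variable ngenes are invented for illustration, and this is not the check MAANOVA itself performs.

```matlab
% Toy inputs, invented for illustration: 3 genes, 2 replicate spots per gene,
% 2 arrays (4 columns: two dye channels for each array).
rep   = 2;                 % number of replicates of each gene on an array
varid = [1 2 2 1];         % variety ID for each data column (row vector)
data  = rand(6, 4);        % raw data: (genes * rep) rows, one column per varid entry

[nrow, ncol] = size(data);

% The rows must divide evenly into genes * rep.
if mod(nrow, rep) ~= 0
    error('Number of rows (%d) is not a multiple of rep (%d).', nrow, rep);
end
ngenes = nrow / rep;       % hypothetical helper variable, not a MAANOVA name

% One variety ID is needed for every data column.
if ncol ~= length(varid)
    error('data has %d columns but varid has %d entries.', ncol, length(varid));
end

fprintf('%d genes, %d replicates, %d columns: sizes are consistent.\n', ...
        ngenes, rep, ncol);
```

Running a check like this first gives a more specific message than a generic size error later in the analysis.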
Format of raw data: The raw data should be arranged so that the first two columns are array 1, the next two are array 2, etc., and the columns alternate in dye label, e.g., odd-numbered columns are Cy3 and even-numbered columns are Cy5. Replicated measurements of the same clone on the same array should appear in adjacent rows. The input data may have row and column headers, in which case the 'tblread' function is most effective for input. If there are no row or column labels in the data file, it is much faster to use the 'load' function.
For feedback, bug reports, or suggestions, email Hao Wu.
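To make the layout concrete, here is a small sketch that picks apart a toy matrix arranged as described, assuming odd columns are Cy3 and even columns are Cy5. All values and variable names are invented for illustration and are not part of MAANOVA.

```matlab
% Toy raw data, invented for illustration: 2 genes x 2 replicates, 2 arrays.
% Columns: array1-Cy3, array1-Cy5, array2-Cy3, array2-Cy5.
data = [ 11 12 13 14
         21 22 23 24
         31 32 33 34
         41 42 43 44 ];
rep  = 2;

ncol = size(data, 2);

cy3 = data(:, 1:2:ncol);             % odd columns: Cy3, one column per array
cy5 = data(:, 2:2:ncol);             % even columns: Cy5, one column per array

array_of_col = ceil((1:ncol) / 2);   % array number of each column: 1 1 2 2

% Replicates sit in adjacent rows, so gene g occupies rows
% (g-1)*rep + 1 through g*rep.
g = 2;                               % gene of interest
a = 1;                               % array of interest
rows      = (g - 1) * rep + (1:rep); % rows 3 and 4 for gene 2
spots_cy3 = cy3(rows, a);            % Cy3 replicates of gene 2 on array 1
```

A file without labels could be read into data with 'load' before applying the same indexing.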