Microarray Data Analysis
The WEHI Bioinformatics Group contains one of the largest research groups in Australia focusing on the statistical analysis of microarray data.
Members of the microarray group include Gordon Smyth, Terry Speed, Ken Simpson, Alicia Oshlack, Wei Shi, Mark Robinson, Di Wu and Belinda Phipson.
Our Research
Microarrays are a new technology for investigating the expression levels of
thousands of genes simultaneously. Microarrays present new statistical problems
because the data are very high-dimensional, with very little replication.
Microarrays offer an exciting entry point for statisticians and computational
scientists into the modern areas of computational biology and bioinformatics.
For more information see the Statistical Science Web entry on
Microarray Data Analysis.
Microarray experiments present a wealth of analysis problems and our group
has made a number of important methodological contributions. Our research has
produced novel methods for microarray image processing, for normalisation within
and between arrays, for testing for differential expression and for adjusting
for the multiplicity of tests which must be done, for analysing complex
multi-factor experiments, for constructing joint expression profiles which
discriminate between groups of interest, and for validating and improving the
accuracy of cluster analysis. We have developed a novel software package Spot
for processing cDNA microarray images, in collaboration with CSIRO Mathematical
and Information Sciences, and have pioneered the use of the freeware statistical
computing environment R for microarray data analysis.
Although much progress has been made, many more challenges remain in
microarray data analysis. These challenges arise in part from the inherent
variability of cDNA microarrays at the individual slide and spot level, in part
from the large-scale nature of the data, in part from the novel structure of
microarray data, and in part because the full use of expression profiles for
inferring gene function is still only partly explored. The key areas in
microarray data analysis include experimental design, the assessment of
significance for differential expression, discriminant analysis (supervised
learning) and clustering (unsupervised learning). These are supported by equally
important but lower-level techniques for data acquisition, storage, linkage to
gene databases, normalisation and visualisation.
Selected Recent Publications
- Yang, Y. H., and Speed, T. P. (2002). Design issues for cDNA microarray experiments. Nature Reviews Genetics 3, 579-588.
- Yang, Y. H., Buckley, M. J., Dudoit, S., and Speed, T. P. (2002). Comparison of methods for image analysis on cDNA microarray data. Journal of Computational and Graphical Statistics 11, 108-136.
- Yang, Y. H., Dudoit, S., Luu, P., Lin, D. M., Peng, V., Ngai, J., and Speed, T. P. (2002). Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Research 30(4):e15.
- Irizarry, R. A., Hobbs, B., Collin, F., Beazer-Barclay, Y. D., Antonellis, K. J., Scherf, U., and Speed, T. P. (2003). Exploration, normalization and summaries of high density oligonucleotide array probe level data. Biostatistics 4(2), 249-264.
- Smyth, G. K., Yang, Y.-H., Speed, T. P. (2003). Statistical issues in
microarray data analysis. Methods in Molecular Biology 224, 111-136. (PDF)
- Smyth, G. K., and Speed, T. P. (2003). Normalization of cDNA microarray
data. Methods 31, 265-273. (PDF)
- Smyth, G. K. (2004). Linear models and empirical Bayes methods for
assessing differential expression in microarray experiments. Statistical
Applications in Genetics and Molecular Biology 3, No. 1, Article 3.
- Wettenhall, J. M., and Smyth, G. K. (2004). limmaGUI: a graphical user
interface for linear modeling of microarray data. Bioinformatics 20,
3705-3706.
- Peart, M. J., Smyth, G. K., van Laar, R. K., Richon, V. M., Holloway, A.
J., Johnstone, R. W. (2005). Identification and functional significance of
genes regulated by structurally diverse histone deacetylase inhibitors. Proceedings of the National Academy of Sciences of the United States of
America 102, 3697-3702.
- Smyth, G. K., Michaud, J., and Scott, H. (2005). The use of within-array
replicate spots for assessing differential expression in microarray
experiments. Bioinformatics 21(9), 2067-2075.
- Stubbs, J., Simpson, K. M., Triglia, T., Plouffe, D., Tonkin, C. J., Duraisingh, M. T., Maier, A. G., Winzeler, E. A., and Cowman, A. F. (2005). Molecular mechanism for switching of P. falciparum invasion pathways into human erythrocytes. Science 309, 1384-1387.
- Gilad, Y., Oshlack, A., Smyth, G. K., Speed, T. P., and White, K. P.
(2006). Expression profiling in primates reveals a rapid evolution of human
transcription factors. Nature 440, 242-245.
- Wettenhall, J. M., Simpson, K. M., Satterley, K., and Smyth, G.
K. (2006). affylmGUI: a graphical user interface for linear modeling of
single channel microarray data. Bioinformatics 22, 897-899.
- Ritchie, M. E., Diyagama, D., Neilson, J., van Laar, R., Dobrovic, A.,
Holloway, A., and Smyth, G. K. (2006). Empirical array quality weights for
microarray data. BMC Bioinformatics 7, 261.
- Holloway, A. J., Oshlack, A., Diyagama, D. S., Bowtell, D. D. L.,
and Smyth, G. K. (2006). Statistical analysis of an RNA titration series
evaluates microarray precision and sensitivity on a whole-array basis.
BMC Bioinformatics 7, Article 511.
- Oshlack, A., Emslie, D., Corcoran, L., and Smyth, G. K.
(2007). Normalization of boutique two-color microarrays with a
high proportion of differentially expressed probes. Genome
Biology 8, R2.
Software
Tutorials

`Sushi'. Segmentation results from microarray image.
Art in Science, WEHI Director's Prize Winner 2000.
By Yee Hwa Yang, Bioinformatics Group, WEHI.
Comments/Questions? Contact bioinf@wehi.edu.au.