Microarray Data Analysis

The WEHI Bioinformatics Group contains one of the largest research groups in Australia focusing on the statistical analysis of microarray data.

Members of the microarray group include Gordon Smyth, Terry Speed, Ken Simpson, Alicia Oshlack, Wei Shi, Mark Robinson, Di Wu and Belinda Phipson.

Our Research

Microarrays are a new technology for investigating the expression levels of thousands of genes simultaneously. Microarrays present new statistical problems because the data are very high-dimensional with very little replication. Microarrays offer an exciting entry point for statisticians and computational scientists into the modern areas of computational biology and bioinformatics. For more information, see the Statistical Science Web entry on Microarray Data Analysis.

Microarray experiments present a wealth of analysis problems and our group has made a number of important methodological contributions. Our research has produced novel methods for microarray image processing, for normalisation within and between arrays, for testing for differential expression and adjusting for the multiplicity of tests which must be done, for analysing complex multi-factor experiments, for constructing joint expression profiles which discriminate between groups of interest, and for validating and improving the accuracy of cluster analysis. We have developed a novel software package, Spot, for processing cDNA microarray images, in collaboration with CSIRO Mathematical and Information Sciences, and have pioneered the use of the freeware statistical computing environment R for microarray data analysis.
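
To give a feel for what adjusting for the multiplicity of tests involves, the following is a minimal sketch of one widely used approach, the Benjamini-Hochberg false discovery rate adjustment. It is an illustration only, not code from our software: the function name `bh_adjust` and the five p-values are hypothetical, and Python is used purely for brevity.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Illustrative sketch only; `bh_adjust` is a hypothetical helper.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to the smallest, scaling each
    # p-value by m / rank and keeping the adjusted values monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for five genes (made up for illustration).
pvals = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = bh_adjust(pvals)

# Count genes below a 0.05 cutoff before and after adjustment.
raw_hits = sum(p < 0.05 for p in pvals)
adj_hits = sum(p < 0.05 for p in adj)
print(raw_hits, adj_hits)
```

In this toy example, fewer genes pass the 0.05 cutoff after adjustment than before. With thousands of genes on an array, the gap between unadjusted and adjusted results is typically far larger.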

Although much progress has been made, many challenges remain in microarray data analysis. These challenges arise in part from the inherent variability of cDNA microarrays at the individual slide and spot level, in part from the large-scale nature of the data, in part from the novel structure of microarray data, and in part because the full use of expression profiles for inferring gene function is still only partly explored. The key areas in microarray data analysis include experimental design, the assessment of significance for differential expression, discriminant analysis (supervised learning) and clustering (unsupervised learning). These are supported by equally important but lower-level techniques for data acquisition, storage, linkage to gene databases, normalisation and visualisation.

Selected Recent Publications

  • Yang, Y. H., and Speed, T. P. (2002). Design issues for cDNA microarray experiments. Nature Reviews Genetics 3, 579-588.
  • Yang, Y. H., Buckley, M. J., Dudoit, S., and Speed, T. P. (2002). Comparison of methods for image analysis on cDNA microarray data. Journal of Computational and Graphical Statistics 11, 108-136.
  • Yang, Y. H., Dudoit, S., Luu, P., Lin, D. M., Peng, V., Ngai, J., and Speed, T. P. (2002). Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Research 30(4):e15.
  • Irizarry, R. A., Hobbs, B., Collin, F., Beazer-Barclay, Y. D., Antonellis, K. J., Scherf, U., and Speed, T. P. (2003). Exploration, normalization and summaries of high density oligonucleotide array probe level data. Biostatistics 4(2), 249-264.
  • Smyth, G. K., Yang, Y.-H., and Speed, T. P. (2003). Statistical issues in microarray data analysis. Methods in Molecular Biology 224, 111-136. (PDF)
  • Smyth, G. K., and Speed, T. P. (2003). Normalization of cDNA microarray data. Methods 31, 265-273. (PDF)
  • Smyth, G. K. (2004). Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology 3, No. 1, Article 3.
  • Wettenhall, J. M., and Smyth, G. K. (2004). limmaGUI: a graphical user interface for linear modeling of microarray data. Bioinformatics 20, 3705-3706.
  • Peart, M. J., Smyth, G. K., van Laar, R. K., Richon, V. M., Holloway, A. J., and Johnstone, R. W. (2005). Identification and functional significance of genes regulated by structurally diverse histone deacetylase inhibitors. Proceedings of the National Academy of Sciences of the United States of America 102, 3697-3702.
  • Smyth, G. K., Michaud, J., and Scott, H. (2005). The use of within-array replicate spots for assessing differential expression in microarray experiments. Bioinformatics 21(9), 2067-2075.
  • Stubbs, J., Simpson, K. M., Triglia, T., Plouffe, D., Tonkin, C. J., Duraisingh, M. T., Maier, A. G., Winzeler, E. A., and Cowman, A. F. (2005). Molecular mechanism for switching of P. falciparum invasion pathways into human erythrocytes. Science 309, 1384-1387.
  • Gilad, Y., Oshlack, A., Smyth, G. K., Speed, T. P., and White, K. P. (2006). Expression profiling in primates reveals a rapid evolution of human transcription factors. Nature 440, 242-245.
  • Wettenhall, J. M., Simpson, K. M., Satterley, K., and Smyth, G. K. (2006). affylmGUI: a graphical user interface for linear modeling of single channel microarray data. Bioinformatics 22, 897-899.
  • Ritchie, M. E., Diyagama, D., Neilson, J., van Laar, R., Dobrovic, A., Holloway, A., and Smyth, G. K. (2006). Empirical array quality weights for microarray data. BMC Bioinformatics 7, 261.
  • Holloway, A. J., Oshlack, A., Diyagama, D. S., Bowtell, D. D. L., and Smyth, G. K. (2006). Statistical analysis of an RNA titration series evaluates microarray precision and sensitivity on a whole-array basis. BMC Bioinformatics 7, Article 511.
  • Oshlack, A., Emslie, D., Corcoran, L., and Smyth, G. K. (2007). Normalization of boutique two-color microarrays with a high proportion of differentially expressed probes. Genome Biology 8, R2.

Software

Tutorials

 


'Sushi'. Segmentation results from a microarray image.
Art in Science, WEHI Director's Prize Winner 2000.
By Yee Hwa Yang, Bioinformatics Group, WEHI.


Comments/Questions? Contact bioinf@wehi.edu.au.