Statistical Genetics Group

WEHI Statistical Genetics Group photo

Left to right: Peter Diakumis, Rick Tankard, Peter Hickey, Melanie Bahlo, Katherine Smith, Lyndal Henden, Natacha Tessier, Saskia Freytag, Vesna Lukic, Dineika Chandrananda
Inset left to right: Karen Oliver, Thomas Scerri, Stuart Lee, Anna Quaglieri

The Statistical Genetics Group (Bahlo lab) works on statistical genetics problems in both humans and mice.

Members of the Bahlo lab are:

  • Melanie Bahlo (Head of Lab)
  • Thomas Scerri (Postdoc)
  • Saskia Freytag (Postdoc)
  • Dineika Chandrananda (PhD Student)
  • Katherine Smith (PhD Student)
  • Rick Tankard (PhD Student)
  • Lyndal Henden (PhD Student)
  • Karen Oliver (Masters Student)
  • Stuart Lee (Masters Student)
  • Peter Diakumis (Research Assistant)
  • Natacha Tessier (Research Assistant)
  • Anna Quaglieri (Research Assistant)
  • Peter Hickey (Research Assistant)
  • Vesna Lukic (Research Assistant)

Some of the past students and staff are:

We are always looking for students to join the lab, either as a UROP student, honours student (through Melbourne University Maths & Stats) or PhD student. Please note that WEHI requires PhD students to attract a PhD scholarship. For further information see Bioinformatics information for students and WEHI Education.

Our Research

One of the major aims of statistical genetics is to localise genes that influence traits, including disease-causing genes.
We can apply traditional statistical methods, such as likelihood modelling, to identify regions of the genome that are likely to harbour genes of interest. This allows geneticists to examine the few remaining candidates biologically for association with the trait. It is usually not possible to check every one of the tens of thousands of human genes directly, because such direct comparisons may not reflect the underlying DNA changes: the signal may be absent altogether, or too weak to survive genome-wide multiple testing adjustments.
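To illustrate why a weak signal may not survive genome-wide multiple testing adjustments, here is a minimal sketch using the Bonferroni correction. This is one common and conservative adjustment, chosen here for simplicity; it is not necessarily the adjustment used in any particular study. The function names and the example numbers are assumptions for illustration only.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def survives_adjustment(p_value: float, alpha: float, n_tests: int) -> bool:
    """True if a single test's p-value passes the Bonferroni-adjusted threshold."""
    return p_value <= bonferroni_threshold(alpha, n_tests)


if __name__ == "__main__":
    # Hypothetical scan: one million tests at a family-wise level of 0.05.
    print(bonferroni_threshold(0.05, 1_000_000))
    # A p-value of 1e-6 looks strong for one test but fails genome-wide.
    print(survives_adjustment(1e-6, 0.05, 1_000_000))
```

With a million tests, a result needs a p-value of roughly 5 in 100 million to pass, which is why narrowing the search to a few candidate regions first is so useful.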

The first step is to gather familial information such as pedigrees (who is related to whom) and measurements of the trait (quantitative or qualitative). Then we carry out a genome-wide scan; in other words, we have no prior guesses (candidates) as to which gene may be involved in influencing the trait. A genome-wide scan uses many hundreds of markers spaced approximately evenly along the human genome. Because we tend to inherit whole segments of chromosomes from our ancestors, we can follow the path of inheritance of these markers through the pedigrees. If this path coincides with the pattern of the trait, i.e. a particular form of a marker, called an allele, is always associated with a high value of the trait, then evidence mounts that this marker is located close to a gene influencing the trait.

This method, known as linkage analysis, comes in many different flavours and has been used successfully for many decades to map genes such as those for Huntington's disease and muscular dystrophy, and the breast cancer genes BRCA1 and BRCA2.
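As a rough illustration of the simplest flavour of linkage analysis, the sketch below computes a two-point LOD score. It assumes phase-known, fully informative meioses in which recombinants between marker and trait locus can be counted directly, which is a textbook simplification and not the general pedigree likelihood used in practice. The function name and parameters are assumptions for illustration.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point LOD score: log10 of L(theta) / L(0.5).

    L(theta) = theta**R * (1 - theta)**(N - R), where R is the number of
    recombinant meioses out of N, and theta is the recombination fraction.
    """
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")

    def log10_power(base: float, exponent: int) -> float:
        # Handle 0**0 = 1 and 0**k = 0 without calling log10(0).
        if exponent == 0:
            return 0.0
        if base == 0.0:
            return -math.inf
        return exponent * math.log10(base)

    non_recombinants = meioses - recombinants
    log_linked = log10_power(theta, recombinants) + log10_power(1.0 - theta, non_recombinants)
    log_unlinked = meioses * math.log10(0.5)  # free recombination, theta = 0.5
    return log_linked - log_unlinked


if __name__ == "__main__":
    # Hypothetical data: 1 recombinant in 12 meioses, evaluated at theta = 0.1.
    print(round(lod_score(1, 12, 0.1), 2))
```

By long-standing convention, a LOD score of 3 or more is taken as significant evidence of linkage; ten meioses with no recombinants give a score of about 3.01 at theta = 0.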

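The current marker of choice is the SNP (Single Nucleotide Polymorphism) marker, a position in the DNA where a single base differs between copies. An example would be the following pair of sequences, which differ only at the fourth base (T versus A):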

GTATGTTCAAC (maternally inherited DNA)

GTAAGTTCAAC (paternally inherited DNA)
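The comparison in this example can be sketched in a few lines of code. This is only an illustration of what a SNP is, using the two sequences above; real genotyping does not work by comparing strings, and the function name is an assumption.

```python
def differing_positions(maternal: str, paternal: str) -> list:
    """Return (position, maternal_base, paternal_base) for each mismatch.

    Positions are 1-based. Both sequences must have the same length.
    """
    if len(maternal) != len(paternal):
        raise ValueError("sequences must have the same length")
    return [
        (index + 1, m_base, p_base)
        for index, (m_base, p_base) in enumerate(zip(maternal, paternal))
        if m_base != p_base
    ]


if __name__ == "__main__":
    print(differing_positions("GTATGTTCAAC", "GTAAGTTCAAC"))
    # prints [(4, 'T', 'A')]
```

This individual carries both the T and the A allele at the marker, i.e. is heterozygous at this SNP.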

Fortunately, many hundreds of thousands of these exist throughout the human genome. The process of determining the form, or alleles, at each of these SNP markers for each individual is known as genotyping. To genotype individuals we usually need buccal swabs (mouth swabs) or a small blood sample, from which the DNA is extracted. We now have access to high-throughput genotyping technologies that allow the genotyping of ~1 million SNPs per person within 24 hours for ~$1000 AUD.

These high-throughput platforms were developed rapidly in the last 5 years to enable the hunt for the causes of common complex diseases using the phenomenon of linkage disequilibrium (LD). This was also facilitated by massive projects such as the HapMap project.
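Linkage disequilibrium is the non-random association of alleles at two markers. As a minimal sketch, the standard pairwise measures D and r² can be computed from haplotype and allele frequencies as below. The function name and the example frequencies are assumptions for illustration; real analyses estimate these frequencies from genotype data.

```python
def ld_statistics(p_ab: float, p_a: float, p_b: float) -> tuple:
    """Return (D, r_squared) for two biallelic markers.

    p_ab: frequency of haplotypes carrying allele A at the first marker
          and allele B at the second marker
    p_a:  frequency of allele A at the first marker
    p_b:  frequency of allele B at the second marker
    """
    # D measures how far the haplotype frequency departs from independence.
    d = p_ab - p_a * p_b
    denominator = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denominator == 0.0:
        raise ValueError("both markers must be polymorphic")
    return d, (d * d) / denominator


if __name__ == "__main__":
    # Hypothetical frequencies: A and B always occur together.
    print(ld_statistics(0.5, 0.5, 0.5))
    # Hypothetical frequencies: A and B are independent.
    print(ld_statistics(0.25, 0.5, 0.5))
```

An r² near 1 means one marker predicts the other almost perfectly, so a genotyped SNP can stand in for nearby variants that were not genotyped.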

The AGRF (Australian Genome Research Facility) is a high-throughput genotyping facility that can genotype many markers for many individuals in a single day. There is considerable technology involved in the genotyping process.

The pedigree, trait and genotyping information is then combined and analysed with probability models that measure the significance of the linkage. This is a computationally challenging problem, since the number of possible ways in which the genotyping and trait data could have been transmitted through the pedigree is often very large. Hence we use fast computers with lots of memory to carry out these calculations.
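To give a feel for the scale of the problem, one standard way of counting transmission patterns is by inheritance vectors: each non-founder in a pedigree results from two meioses (one per parent), and each meiosis passes on either the grandpaternal or the grandmaternal copy. The sketch below counts these patterns at a single marker position. It is a simplified illustration under that assumption, not a description of the group's software, and the function name is hypothetical.

```python
def inheritance_vector_count(non_founders: int) -> int:
    """Number of possible inheritance patterns at one marker position.

    Each non-founder contributes two meioses, each with two possible
    outcomes, giving 2 ** (2 * non_founders) patterns in total.
    """
    if non_founders < 0:
        raise ValueError("non_founders must be non-negative")
    return 2 ** (2 * non_founders)


if __name__ == "__main__":
    # Growth is exponential in pedigree size.
    for n in (2, 10, 20):
        print(n, inheritance_vector_count(n))
```

A pedigree with only 20 non-founders already has about 10^12 patterns per marker position, which is why memory and computing speed matter.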

One can also use experimental animal models, such as mice, to map disease genes. These also form pedigrees and we collect the trait data, also known as phenotyping data, and genotyping data in a similar way.

We collaborate closely with the Molecular Medicine Division on murine mapping projects and with the AGRF (Australian Genome Research Facility) on genotyping and quality control.

Selected Recent Publications

For a list of selected recent publications please see Dr Melanie Bahlo’s profile.

Software

  • LINKDATAGEN: Preparation and QC of SNP and MPS data for linkage and relatedness analysis (Smith et al. 2011, Bahlo and Bromhead 2009)
  • BrainGEP: In silico gene prioritisation using Allen Human Brain Atlas microarray data (Oliver, Lukic, et al. 2014)
  • Mutation Age Estimation: A simple R script which performs the calculations required to produce mutation age estimates and confidence intervals (Gandolfo, Bahlo and Speed 2014)
  • HAPLOCLUSTERS: Haplotype association mapping tool (Bahlo et al. 2006)