WEHI Bioinformatics

EffiSim - Measuring the efficiency of mutation screens for recessive phenotypes

EffiSim is a program that calculates two measures of the efficiency of mutation screens for recessive phenotypes. The first, called the efficiency, is the expected number of distinct mutations screened. The second, called the balanced efficiency, is the expected number of mutations screened for a given amount of work. Given the costs of a screen, the program can be used to determine the most efficient design, i.e. the design that screens, on average, the largest number of mutations for a given amount of work. It can also help determine which of the backcross (BC) and intercross (IC) breeding strategies is more appropriate for a given mutation screen.

To install EffiSim, download the archive EffiSim.tar and then untar it with the command:
tar -xvf EffiSim.tar

A screen for recessively acting mutations requires a three-generation breeding strategy to produce homozygotes for the novel mutations. The design of a screen describes the number of mice bred at each generation, using the following variables. Let x be the number of mutations inherited per G₁, d be the number of G₁ pairs, h be the number of G₂ crosses per G₁ pair (h = 1 for the backcross, whereas the intercross can have h > 1), k be the number of G₂ females mating with each G₂ male, n be the number of G₃ pups born to each G₂ female, and r be the number of G₃ pups born to each G₂ male (thus r = nk).
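As a small illustration of how the design variables relate, the sketch below stores d, h, k and n and derives r = nk. The class name `ScreenDesign` and the helper `total_g3_pups` are hypothetical names chosen for this example and are not part of EffiSim. Here n is taken as the number of G₃ pups per G₂ female, matching the 'n' setting of the `-p` option.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreenDesign:
    """Hypothetical container for the design variables (not EffiSim code)."""
    d: int  # number of G1 pairs
    h: int  # number of G2 crosses per G1 pair (1 for the backcross)
    k: int  # number of G2 females mated with each G2 male
    n: int  # number of G3 pups per G2 female

    @property
    def r(self) -> int:
        """G3 pups per G2 male, r = n * k."""
        return self.n * self.k

    @property
    def total_g3_pups(self) -> int:
        """G3 pups over the whole screen: d * h * r."""
        return self.d * self.h * self.r


design = ScreenDesign(d=2, h=1, k=4, n=5)
print(design.r, design.total_g3_pups)  # 20 40
```

With d = 2, h = 1, k = 4 and n = 5, each G₂ male sires r = 20 pups, which is the same design as one fixed at 20 pups per G₂ male.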

An important feature of EffiSim is the cost equation, that is, the weights attributed to each generation. Any G₁ mouse carries, and therefore can transmit, only a finite number of mutations. The law of diminishing returns suggests that the greatest returns (in terms of the number of mutations screened) will come from screening the first few mice. This line of thought implies that, to screen the greatest number of distinct mutations, one should screen only a few G₃ progeny of any G₁ pair. However, generating the G₁ and G₂ mice is itself time-consuming and expensive, which implies that it is worth screening many G₃ pups to make this effort worthwhile. Hence there are two competing interests that must be balanced. This can be done by finding the screen with the greatest balanced efficiency: it will screen the greatest number of mutations for a given amount of work.

Doing this requires careful consideration of where the costs go in a screen. One simple approach is to calculate the average amount of work required (w_Gi) for each mouse in generation i. Using the design parameters above (d, h, k, and r), we can calculate the number of mice in each generation. Putting these together gives a way of calculating the cost of a screen that uses the same variables that are required to estimate the returns. It is therefore important that the weights accurately represent the cost/labour/time/space/effort involved in conducting the screen. The equation used to calculate the cost of the screen is g(d,h,k,r) = 2d·w_G1 + 2dhk·w_G2 + dhr·w_G3, so that each term is the number of mice bred in a generation multiplied by that generation's weight. This equation has been designed to take into account all mice bred, not only mice that are used in the screen (e.g. in the backcross, any male G₂ mice are not used in the screen). It should be altered as necessary.
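The cost equation can be sketched in a few lines of Python. This is an illustration rather than EffiSim's own code: it assumes the equation is read as g(d,h,k,r) = 2d·w_G1 + 2dhk·w_G2 + dhr·w_G3, with each weight multiplying the number of mice bred in that generation. The function name `screen_cost` and its argument names are hypothetical.

```python
def screen_cost(d, h, k, r, w_g1=1.0, w_g2=1.0, w_g3=1.0):
    """Cost g(d, h, k, r) of a screen under the assumed reading of the equation.

    d: number of G1 pairs
    h: number of G2 crosses per G1 pair
    k: number of G2 females mated with each G2 male
    r: number of G3 pups per G2 male
    w_g1, w_g2, w_g3: average work per mouse in each generation
    """
    g1_mice = 2 * d          # two mice per G1 pair
    g2_mice = 2 * d * h * k  # all G2 mice bred, including any not used
    g3_mice = d * h * r      # all G3 pups
    return g1_mice * w_g1 + g2_mice * w_g2 + g3_mice * w_g3


# Backcross design with d=2, h=1, k=4, r=20 and all weights equal to 1.
print(screen_cost(2, 1, 4, 20))  # 60.0

# Same design when G1 and G2 mice are costlier than G3 pups.
print(screen_cost(2, 1, 4, 20, w_g1=5, w_g2=2, w_g3=1))  # 92
```

Raising w_g1 and w_g2 relative to w_g3 makes designs with more G₃ pups per G₁ pair comparatively cheaper per pup, which is the trade-off the balanced efficiency is meant to capture.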

There are two ways of using EffiSim. The first requires you to enter the design of the mutation screen on the command line. The second allows you to describe a range of designs (or simply a single design) by listing the possible values of each parameter in a file. In either case, EffiSim writes the results to an output file.
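To make the idea of a range of designs concrete, the sketch below expands a list of candidate values for each of d, h, k and r into individual designs. It assumes that every combination of the listed values is of interest. It does not reproduce the parameter file format, which is defined by Example.txt, and the function name `enumerate_designs` is hypothetical.

```python
from itertools import product


def enumerate_designs(d_values, h_values, k_values, r_values):
    """Yield one design per combination of the candidate parameter values."""
    for d, h, k, r in product(d_values, h_values, k_values, r_values):
        yield {"d": d, "h": h, "k": k, "r": r}


# Two choices for d, one for h, two for k and two for r give 8 designs.
designs = list(enumerate_designs([1, 2], [1], [2, 4], [10, 20]))
print(len(designs))  # 8
print(designs[0])    # {'d': 1, 'h': 1, 'k': 2, 'r': 10}
```

The number of designs grows multiplicatively with the length of each list, so long lists for several parameters can produce a large output.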

To run EffiSim, you will need Perl. To run the script, type:
perl EffiSim.pl [options]

Option	What it changes	Default
`-p`	Specifies the parameters of the screen. They are given in the following order, without spaces, separated by commas: length of the target interval (either 'g' for a genome-wide screen or, for a regional screen, the length in cM), breeding scheme ('ic' for the intercross and 'bc' for the backcross), type of screen ('m' for a mutation screen and 's' for a sensitised screen), how to fix the number of G₃ pups ('r' means fixed per G₂ male and 'n' means fixed per G₂ female), the number of G₁ pairs (d), the number of G₂ crosses per G₁ pair (h), the number of G₂ females mating with each G₂ male (k), the fixed number of G₃ pups (either per G₂ male or per G₂ female).	No default. Either `-p` or `-f` must be specified
`-f`	The name of the file that specifies a range of parameters over which the efficiency should be calculated. It must follow the format of the file Example.txt. Most of the other options can be specified using this parameter file. If you want to specify weights other than 1 for all generations, then you must use this option.	No default. Either `-p` or `-f` must be specified
`-n`	The name of the file where the results are to be sent.	No default. This file must be specified
`-x`	Specifies the expected number of mutations inherited by each G₁ pup. This must be an integer.	100
`-y`	If a mutation is 'screened', it must be homozygous in y mice. This must be a positive integer.	1
`-o`	Output type. The short option, 's', gives all the results for each design in a single line, while the long option, 'l', lists the number of mutations that are screened in y mice. If you want to compare a large number of screen designs, the results are probably easiest to view in the short format.	s
`-s`	Whether simulations are performed. If the value is 0, no simulations are run and only the theoretical efficiency is calculated (this may be much faster). If the value is a positive integer, it gives the number of times the simulation is repeated.	0

Each option must be followed by a colon and then the value that is required. Spaces are required in between different options, but not between the option, its colon and its value. Here are a few examples:
perl EffiSim.pl -f:Example.txt -n:output.txt -o:s
perl EffiSim.pl -p:g,bc,m,r,2,1,4,20 -n:output.txt -o:l -x:50 -y:2 -s:100
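If you generate many command lines from a script, the option syntax described above (option, colon, value, with spaces only between options) can be assembled programmatically. The following sketch only builds the command string and does not run EffiSim. The function `build_effisim_command` and its argument names are hypothetical, and the checks cover only the values listed in the option table.

```python
import shlex


def build_effisim_command(interval, scheme, screen_type, pup_mode, d, h, k, pups,
                          output_file, output_type="s", x=100, y=1, simulations=0):
    """Assemble an EffiSim command line for a single design given with -p."""
    if scheme not in ("ic", "bc"):
        raise ValueError("breeding scheme must be 'ic' or 'bc'")
    if screen_type not in ("m", "s"):
        raise ValueError("type of screen must be 'm' or 's'")
    if pup_mode not in ("r", "n"):
        raise ValueError("G3 pup mode must be 'r' or 'n'")
    # The -p values are comma-separated, in the documented order, without spaces.
    params = ",".join(str(v) for v in
                      (interval, scheme, screen_type, pup_mode, d, h, k, pups))
    return ["perl", "EffiSim.pl", f"-p:{params}", f"-n:{output_file}",
            f"-o:{output_type}", f"-x:{x}", f"-y:{y}", f"-s:{simulations}"]


cmd = build_effisim_command("g", "bc", "m", "r", 2, 1, 4, 20, "output.txt",
                            output_type="l", x=50, y=2, simulations=100)
print(shlex.join(cmd))
# perl EffiSim.pl -p:g,bc,m,r,2,1,4,20 -n:output.txt -o:l -x:50 -y:2 -s:100
```

The list form returned by the function can be passed directly to a process launcher, while `shlex.join` gives a string suitable for pasting into a shell.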

To reference this program, please cite:
Silver J.D., Hilton D.J., Bahlo M., Kile B.T. (2006) Efficient breeding strategies for the generation of ENU mutant mice with recessive phenotypes. Submitted for publication.

Finally, there is a file that lists all bugs/revisions/comments that have been reported.

Comments/Questions? Contact Jeremy Silver: silver@wehi.edu.au.

Last modified: 19-12-2005