Linkdatagen is a Perl script that generates LINKAGE-style files for ALLEGRO, MERLIN, PREST, MORGAN, PLINK, FEstim, BEAGLE and RELATE. It takes as input genotype calls from Affymetrix SNP chips, Illumina SNP chips, or SNP genotypes inferred from massively parallel sequencing (MPS) data, such as whole exome or whole genome sequence data.
- The first incarnation of linkdatagen.pl was only able to process genotypes from Affymetrix SNP chips.
- Subsequently, linkdatagen.pl was renamed linkdatagen_affy.pl. Separate scripts were developed for Illumina SNP chip genotypes (linkdatagen_illumina.pl) and MPS genotypes (linkdatagen_mps.pl and companion script vcf2linkdatagen.pl).
- As of 15th May 2012, the three linkdatagen scripts have been combined into a single script named linkdatagen.pl. The type of genotypes being processed is indicated by the -data option ('a' for Affymetrix SNP chip data, 'i' for Illumina SNP chip data, or 'm' for SNP genotypes from MPS data). vcf2linkdatagen.pl remains a separate script that must be run before using linkdatagen.pl with the -data m option.
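The relationship between genotype source and -data value can be sketched as a small lookup. This is only an illustrative Python helper and not part of linkdatagen: the function names and the source labels ("affymetrix", "illumina", "mps") are assumptions made for the example, and only the -data values 'a', 'i' and 'm' come from the text above. Any other options that linkdatagen.pl needs are not shown.

```python
# Hypothetical helper: maps a genotype source label to the value of
# linkdatagen.pl's -data option. The labels on the left are assumptions;
# the codes on the right are the ones listed in the text.
DATA_CODES = {
    "affymetrix": "a",  # Affymetrix SNP chip data
    "illumina": "i",    # Illumina SNP chip data
    "mps": "m",         # SNP genotypes from MPS data
}


def data_option(source):
    """Return the ['-data', code] argument pair for a genotype source."""
    key = source.strip().lower()
    if key not in DATA_CODES:
        raise ValueError(f"unknown genotype source: {source!r}")
    return ["-data", DATA_CODES[key]]


def needs_vcf_conversion(source):
    """True when vcf2linkdatagen.pl must be run first (MPS genotypes only)."""
    return data_option(source)[1] == "m"
```

A wrapper script could call `needs_vcf_conversion("mps")` to decide whether the VCF conversion step has to happen before linkdatagen.pl is run.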
HapMap Phase I, II and III data.
Download the Affymetrix build 36 annotation files. Last updated 29th April 2013.
Download the Affymetrix build 37 annotation files. Last updated 15th November 2012.
Download the test data set testdata.tar.gz. Last updated 15th November 2012.
HapMap Phase I, II and III data. We support a range of Illumina chips including the 370Duo, 610Quad, 660Quad, Cyto12, OmniExpress, and 1M chips.
Download the Illumina annotation files (build 37). Last updated 15th November 2012.
vcf2linkdatagen.pl. Last updated 20th August 2012. This is a companion script that converts VCF files into a BRLMM genotype call file that can be processed by linkdatagen.pl.
Download the vcf2linkdatagen documentation. Last updated 15th May 2012.
Download the HapMap Phase II annotation files. Last updated 6th March 2012. Annotation for up to 4,071,899 SNPs for the four HapMap Phase II populations (CHB, CEU, JPT, YRI).
Download the HapMap Phase III annotation files. Last updated 6th March 2012. Annotation for up to 1,594,838 SNPs for the eleven HapMap Phase III populations.
Download our quick-start guide to processing MPS genotypes for linkage analysis. Last updated 13th November 2012.
Download test data from Smith KR et al. (2011). Last updated 18th July 2011.
VCF files containing SNP genotypes at the location of HapMap Phase II SNPs:
(i) Family A: Single affected individual - A-7.HapMapII.SNPs.vcf.gz, recessive family, homozygosity mapping
(ii) Family T: Single affected individual - T-1.HapMapII.SNPs.vcf.gz, recessive family, homozygosity mapping
(iii) Family M: Two affected siblings - M-3.HapMapII.SNPs.vcf.gz and M-4.HapMapII.SNPs.vcf.gz, dominant family.
SNP genotypes from Illumina genotyping arrays for the same individuals:
(i) Family A: A-7.FinalReport.txt.gz
(ii) Family T: T-1.FinalReport.txt.gz
(iii) Family M: M-3.FinalReport.txt.gz and M-4.FinalReport.txt.gz
VCF files containing SNP genotypes at the location of genotyping array SNPs (for concordance checks):
(i) Family A: A-7.IL610Q.SNPs.vcf.gz
(ii) Family T: T-1.IL610Q.SNPs.vcf.gz
(iii) Family M: M-3.ILOE.SNPs.vcf.gz and M-4.ILOE.SNPs.vcf.gz
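As one way to picture the concordance check these files are intended for, the sketch below compares two sets of genotype calls keyed by SNP identifier. It is an illustrative Python function and not part of linkdatagen: the dictionary input, the two-letter genotype strings, the function name and the SNP identifiers in the usage lines are all assumptions, and parsing of the VCF and FinalReport files is not shown.

```python
def concordance(mps_calls, array_calls, missing=None):
    """Fraction of SNPs called in both sets whose genotypes agree.

    Both arguments are assumed to be dicts mapping a SNP identifier to a
    genotype string such as 'AG'. Genotypes are compared as unordered
    allele pairs, so 'AG' and 'GA' count as a match. Returns None when
    no SNP is called in both sets.
    """
    compared = 0
    matching = 0
    for snp, mps_gt in mps_calls.items():
        array_gt = array_calls.get(snp, missing)
        # Skip SNPs that are absent or uncalled in either data set.
        if mps_gt == missing or array_gt == missing:
            continue
        compared += 1
        if sorted(mps_gt) == sorted(array_gt):
            matching += 1
    if compared == 0:
        return None
    return matching / compared


# Made-up calls for illustration only (not taken from the test data).
mps = {"snp1": "AG", "snp2": "CC", "snp3": None}
array = {"snp1": "GA", "snp2": "CT", "snp3": "TT", "snp4": "AA"}
print(concordance(mps, array))  # 0.5: snp1 matches, snp2 differs, snp3 is skipped
```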
Email bug reports & questions to Melanie Bahlo (firstname.lastname@example.org).
If you use linkdatagen.pl, please acknowledge by citing:
If you use linkdatagen.pl and/or vcf2linkdatagen.pl to process MPS genotypes, please also cite:
Smith KR, Bromhead CJ, Hildebrand MS, Shearer AE, Lockhart PJ, Najmabadi H, Leventer RJ, McGillivray G, Amor DJ, Smith RJ, Bahlo M (2011). Reducing the exome search space for Mendelian diseases using genetic linkage analysis of exome genotypes. Genome Biology 12:R85.