Genotyping-in-thousands by Sequencing (GT-seq): A Cost Effective SNP Genotyping Method Based on Custom Amplicon Sequencing
Genotyping-in-Thousands by sequencing (GT-seq) is a method that uses next-generation sequencing of multiplexed PCR products to generate genotypes from relatively small panels (50–500) of targeted single-nucleotide polymorphisms (SNPs) for thousands of individuals in a single Illumina HiSeq lane. This method uses only unlabelled oligos and PCR master mix in two thermal cycling steps for amplification of targeted SNP loci. During this process, sequencing adapters and dual barcode sequence tags are incorporated into the amplicons enabling thousands of individuals to be pooled into a single sequencing library. Post sequencing, reads from individual samples are split into individual files using their unique combination of barcode sequences. Genotyping is performed with a simple perl script which counts amplicon-specific sequences for each allele, and allele ratios are used to determine the genotypes. We demonstrate this technique by genotyping 2068 individual steelhead trout (Oncorhynchus mykiss) samples with a set of 192 SNP markers in a single library sequenced in a single Illumina HiSeq lane. Genotype data were 99.9% concordant to previously collected TaqMan™ genotypes at the same 192 loci, but call rates were slightly lower with GT-seq (96.4%) relative to Taqman (99.0%). Of the 192 SNPs, 187 were genotyped in ≥90% of the individual samples and only 3 SNPs were genotyped in <70% of samples. This study demonstrates amplicon sequencing with GT-seq greatly reduces the cost of genotyping hundreds of targeted SNPs relative to existing methods by utilizing a simple library preparation method and massive efficiency of scale.
Campbell, N.R., S.A. Harmon, and S.R. Narum. 2014. Genotyping-in-Thousands by sequencing (GT-seq): A cost effective SNP genotyping method based on custom amplicon sequencing. Molecular Ecology Resources 15:855-867.