TY - JOUR
T1 - Global patterns of STR sequence variation
T2 - Sequencing the CEPH human genome diversity panel for 58 forensic STRs using the Illumina ForenSeq DNA Signature Prep Kit
AU - Phillips, Christopher
AU - Devesse, Laurence
AU - Ballard, David
AU - van Weert, Leanne
AU - de la Puente, Maria
AU - Melis, Stefania
AU - Álvarez Iglesias, Vanessa
AU - Freire-Aradas, Ana
AU - Oldroyd, Nicola
AU - Holt, Cydne
AU - Syndercombe Court, Denise
AU - Carracedo, Ángel
AU - Lareu, Maria Victoria
PY - 2018/8/13
Y1 - 2018/8/13
N2 - The 944 individuals of the CEPH human genome diversity panel (HGDP–CEPH), a standard sample set of 51 globally distributed populations, were sequenced using the Illumina ForenSeq™ DNA Signature Prep Kit. The ForenSeq™ system is a single multiplex for the MiSeq/FGx™ massively parallel sequencing instrument, comprising: amelogenin, 27 autosomal STRs, 24 Y-STRs, 7 X-STRs, and 94 SNPforID+Kiddlab autosomal ID-SNPs (plus optionally detected ancestry and phenotyping SNP sets). We report in detail the patterns of sequence variation observed in the repeat regions of the 58 forensic STR loci typed by the ForenSeq™ system. Sequence alleles were characterized and repeat region structures annotated by aligning the ForenSeq™ sequence output to the latest GRCh38 human reference sequence, necessitating the reversal and re-alignment of STR allele sequences reported by the ForenSeq™ system in 20 of 58 STRs (plus the reverse alleles in two Y-STRs with duplicated-inverted repeat regions). Individual population sample sizes of the HGDP–CEPH panel do not allow reliable inferences to be made about levels of genetic variability in low-frequency STR alleles, where particular sequence variants are found in only a few individuals. However, we assessed the occurrence of both population-specific sequence variants and singleton observations, finding each of these in a sizeable proportion of HGDP–CEPH samples, with consequences for planning the co-ordinated compilation of sequence variation on a much larger scale than was required before by forensic laboratories now adopting massively parallel sequencing.
AB - The 944 individuals of the CEPH human genome diversity panel (HGDP–CEPH), a standard sample set of 51 globally distributed populations, were sequenced using the Illumina ForenSeq™ DNA Signature Prep Kit. The ForenSeq™ system is a single multiplex for the MiSeq/FGx™ massively parallel sequencing instrument, comprising: amelogenin, 27 autosomal STRs, 24 Y-STRs, 7 X-STRs, and 94 SNPforID+Kiddlab autosomal ID-SNPs (plus optionally detected ancestry and phenotyping SNP sets). We report in detail the patterns of sequence variation observed in the repeat regions of the 58 forensic STR loci typed by the ForenSeq™ system. Sequence alleles were characterized and repeat region structures annotated by aligning the ForenSeq™ sequence output to the latest GRCh38 human reference sequence, necessitating the reversal and re-alignment of STR allele sequences reported by the ForenSeq™ system in 20 of 58 STRs (plus the reverse alleles in two Y-STRs with duplicated-inverted repeat regions). Individual population sample sizes of the HGDP–CEPH panel do not allow reliable inferences to be made about levels of genetic variability in low-frequency STR alleles, where particular sequence variants are found in only a few individuals. However, we assessed the occurrence of both population-specific sequence variants and singleton observations, finding each of these in a sizeable proportion of HGDP–CEPH samples, with consequences for planning the co-ordinated compilation of sequence variation on a much larger scale than was required before by forensic laboratories now adopting massively parallel sequencing.
KW - Autosomal STRs
KW - CEPH Human genome diversity panel
KW - Massively parallel sequencing
KW - STR
KW - X-STRs
UR - http://www.scopus.com/inward/record.url?scp=85052902391&partnerID=8YFLogxK
U2 - 10.1002/elps.201800117
DO - 10.1002/elps.201800117
M3 - Article
AN - SCOPUS:85052902391
SN - 0173-0835
JO - Electrophoresis
JF - Electrophoresis
ER -