Table 1 Discovery and validation criteria for differentiated genomic regions

From: Characterizing the genetic differences between two distinct migrant groups from Indo-European and Dravidian speaking populations in India

| Criteria | Discovery criterion | Validation criterion |
| --- | --- | --- |
| FST: region with an over-representation of SNPs possessing high FST values relative to the genome-wide distribution of FST scores | Regional evidence in the top 0.1% of the genome-wide distribution, in which: (i) regions are defined by window sizes of 100 kb and 500 kb; (ii) evidence is defined by the P-value of the exact binomial test for the proportion of SNPs with FST in the top 1st percentile (100 kb) or 0.1st percentile (500 kb), respectively, of the genome-wide distribution of FST scores | Discovered region should contain evidence found in the top 1% of the genome-wide distribution |
| Differential iHS signals for GIH and INS | At least one SNP with normalized iHS score in the top 0.19% of the genome-wide distribution in one population, but not present in the top 1% of the genome-wide distribution in the other population | At least one SNP in the discovered region should have an iHS score in the top 1% of the genome-wide distribution, but absent in the top 1% of the genome-wide distribution of iHS scores in the second population |
| XP-EHH between GIH and INS | Normalized XP-EHH scores should lie in the top 0.01% of the genome-wide distribution | At least one SNP in the discovered region should lie in the top 0.5% of the genome-wide distribution of the normalized XP-EHH scores |

  1. A description of the population genetics metrics used to discover and validate genomic regions that are differentiated between the north Indian Gujarati population (GIH) and the south Indian Tamil population from Singapore (INS).
  2. Abbreviations: iHS, integrated haplotype score; XP-EHH, cross-population extended haplotype homozygosity.
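
The FST discovery criterion in Table 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions. The test is taken to be one-sided (over-representation of high-FST SNPs). The null proportion is taken to equal the tail fraction (0.01 for 100 kb windows, 0.001 for 500 kb windows). "Regional evidence in the top 0.1%" is read as the windows with the smallest P-values among all windows. The function names `binomial_upper_tail`, `window_pvalue`, and `top_windows` are hypothetical.

```python
from math import exp, lgamma, log, log1p


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p).

    Each term is evaluated in log space so that windows holding many SNPs
    do not overflow. The sum uses the binomial distribution itself, not a
    normal approximation.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    if k == 0:
        return 1.0
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (
            lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
            + i * log(p) + (n - i) * log1p(-p)
        )
        total += exp(log_pmf)
    return min(1.0, total)


def window_pvalue(fst_in_window, genome_cutoff, tail_fraction):
    """One-sided P-value for an excess of high-FST SNPs in one window.

    genome_cutoff is the FST value marking the chosen genome-wide tail;
    tail_fraction is the matching null proportion (assumption: 0.01 for
    100 kb windows, 0.001 for 500 kb windows).
    """
    n = len(fst_in_window)
    k = sum(1 for f in fst_in_window if f >= genome_cutoff)
    return binomial_upper_tail(k, n, tail_fraction)


def top_windows(pvalues, fraction):
    """Indices of the windows with the smallest P-values.

    Keeps the given fraction of windows, and at least one window
    (assumption: this is how the top 0.1% of regional evidence is ranked).
    """
    n_keep = max(1, int(len(pvalues) * fraction))
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    return order[:n_keep]


if __name__ == "__main__":
    # Toy data: three windows of per-SNP FST values (made up for illustration).
    windows = [
        [0.01, 0.02, 0.03, 0.01],
        [0.40, 0.45, 0.02, 0.50],
        [0.05, 0.01, 0.42, 0.02],
    ]
    cutoff = 0.35  # hypothetical genome-wide 99th-percentile FST value
    pvals = [window_pvalue(w, cutoff, 0.01) for w in windows]
    print(pvals, top_windows(pvals, 0.001))
```

The validation criterion can be checked with the same functions by relaxing `fraction` to 0.01 on the validation data set.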