Table 3 Selection strategy for the subset based on information without taking into account correlations between haplotype frequency estimates.

From: Assessment of global phase uncertainty in case-control studies

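| Data set | Group | Genotype | Nr of individuals | 111 | 112 | 121 | 122 | 211 | 212 | 221 | Loss per genotype | Total loss |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| hipROA data | Cases (n = 61) | 1HH | 10 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 1.00 | |
| | | HHH | 7 | 0 | 0.03 | 0.18 | 0.19 | 0.19 | 0.18 | 0.03 | 0.79 | |
| | | H1H | 2 | 0.19 | 0.19 | 0 | 0 | 0.19 | 0.19 | 0 | 0.77 | |
| | | HH1 | 3 | 0.04 | 0 | 0.04 | 0 | 0.04 | 0 | 0.04 | 0.16 | |
| | | no ambiguity | 39 | | | | | | | | | |
| | | loss per haplotype | | 3.00 | 3.07 | 3.85 | 3.83 | 1.85 | 1.62 | 0.31 | | 17.52 |
| | Controls (n = 653) | H1H | 28 | 0.21 | 0.21 | 0 | 0 | 0.21 | 0.21 | 0 | 0.83 | |
| | | HH1 | 46 | 0.18 | 0 | 0.18 | 0 | 0.18 | 0 | 0.18 | 0.72 | |
| | | 1HH | 91 | 0.12 | 0.12 | 0.12 | 0.12 | 0 | 0 | 0 | 0.49 | |
| | | HHH | 47 | 0 | 0.04 | 0.06 | 0.10 | 0.10 | 0.06 | 0.04 | 0.40 | |
| | | no ambiguity | 441 | | | | | | | | | |
| | | loss per haplotype | | 25.29 | 19.09 | 22.23 | 15.76 | 18.59 | 8.52 | 10.34 | | 119.81 |
| Simulated data | Cases (n = 500) | 1HH | 83 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 1.00 | |
| | | HHH | 40 | 0 | 0.03 | 0.11 | 0.13 | 0.13 | 0.11 | 0.03 | 0.55 | |
| | | H1H | 26 | 0.11 | 0.11 | 0 | 0 | 0.11 | 0.11 | 0 | 0.44 | |
| | | HH1 | 32 | 0.04 | 0 | 0.04 | 0 | 0.04 | 0 | 0.04 | 0.15 | |
| | | no ambiguity | 319 | | | | | | | | | |
| | | loss per haplotype | | 24.80 | 24.94 | 26.26 | 26.07 | 9.36 | 7.17 | 2.52 | | 121.12 |
| | Controls (n = 500) | H1H | 25 | 0.23 | 0.23 | 0 | 0 | 0.23 | 0.23 | 0 | 0.93 | |
| | | HH1 | 36 | 0.21 | 0 | 0.21 | 0 | 0.21 | 0 | 0.21 | 0.83 | |
| | | HHH | 43 | 0 | 0.05 | 0.06 | 0.11 | 0.11 | 0.06 | 0.05 | 0.44 | |
| | | 1HH | 70 | 0.11 | 0.11 | 0.11 | 0.11 | 0 | 0 | 0 | 0.42 | |
| | | no ambiguity | 326 | | | | | | | | | |
| | | loss per haplotype | | 20.68 | 15.23 | 17.65 | 11.89 | 17.86 | 8.61 | 9.55 | | 101.47 |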
  1. The group identifiers denote the genotype at the SNPs, where 1 and 2 stand for the homozygotes 1/1 and 2/2, and H denotes a heterozygote. The order of the groups is determined by the sum of the diagonal elements of the loss matrix i in (3), shown in the column "loss per genotype". Individuals with a higher loss would yield a higher information gain if their ambiguity could be resolved. The values in the last row of each group, "loss per haplotype", show the information loss per haplotype. The simulated data set is the same sample data set as in Table 2.
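The arithmetic behind the summary columns can be sketched in a few lines of Python, using the hipROA cases as input. This is an illustration under stated assumptions, not the authors' code: it assumes that the per-haplotype entries in each row are the diagonal elements of the loss matrix, that "loss per genotype" is their row sum, and that "loss per haplotype" is the column sum weighted by the number of individuals in each group. The latter is inferred from the printed numbers rather than stated in the footnote. The names `HIPROA_CASES`, `loss_per_genotype`, `loss_per_haplotype` and `selection_order` are hypothetical.

```python
# Haplotype columns as labelled in the table header.
HAPLOTYPES = ["111", "112", "121", "122", "211", "212", "221"]

# hipROA cases: genotype -> (number of individuals, per-haplotype loss as printed).
HIPROA_CASES = {
    "1HH": (10, [0.25, 0.25, 0.25, 0.25, 0, 0, 0]),
    "HHH": (7, [0, 0.03, 0.18, 0.19, 0.19, 0.18, 0.03]),
    "H1H": (2, [0.19, 0.19, 0, 0, 0.19, 0.19, 0]),
    "HH1": (3, [0.04, 0, 0.04, 0, 0.04, 0, 0.04]),
}


def loss_per_genotype(groups):
    """Row sum of the per-haplotype losses for each genotype group (assumed definition)."""
    return {genotype: sum(losses) for genotype, (_, losses) in groups.items()}


def loss_per_haplotype(groups):
    """Column sum weighted by group size (assumption inferred from the table values)."""
    totals = [0.0] * len(HAPLOTYPES)
    for count, losses in groups.values():
        for k, loss in enumerate(losses):
            totals[k] += count * loss
    return dict(zip(HAPLOTYPES, totals))


def selection_order(groups):
    """Genotype groups sorted by decreasing loss per genotype."""
    per_genotype = loss_per_genotype(groups)
    return sorted(per_genotype, key=per_genotype.get, reverse=True)


if __name__ == "__main__":
    print(selection_order(HIPROA_CASES))  # ['1HH', 'HHH', 'H1H', 'HH1']
    per_haplotype = loss_per_haplotype(HIPROA_CASES)
    for haplotype, loss in per_haplotype.items():
        print(haplotype, round(loss, 2))
    print("total loss", round(sum(per_haplotype.values()), 2))
```

Because the table entries are rounded to two decimals, this recomputation matches the printed "loss per haplotype" and "total loss" values only approximately; the ordering of the genotype groups is reproduced exactly.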