 Research article
 Open Access
Entropy-based selection for maternal-fetal genotype incompatibility with application to preterm prelabor rupture of membranes
BMC Genetics volume 15, Article number: 66 (2014)
Abstract
Background
Maternal-fetal genotype incompatibility (MFGI) is increasingly reported to influence human diseases, especially pregnancy-related complications. In practice, it is challenging to identify the ideal incompatibility model for analysis, since the true MFGI mechanism is generally unknown. The underlying MFGI mechanism can vary across genetic variants, and using a single incompatibility model for all circumstances causes power loss in testing MFGI.
Results
In this article, we propose a practical 2-step procedure that first uses a model selection strategy based on an entropy measurement to select the MFGI model best represented by the data, and then tests the significance of the MFGI effect using the chosen model within the generalized linear regression framework.
Conclusions
Our simulation studies show that the proposed 2-step procedure controls the type I error rate and increases the testing power under various scenarios. In a real data application, our analysis reveals genes with an MFGI effect that may not be detected by a counterpart without model selection.
Background
Current advances in high-throughput biotechnology have popularized genome-wide association studies (GWAS) to detect genetic variants that increase the risk of complex diseases. Over the past decade, thousands of single nucleotide polymorphisms (SNPs) have been reported to be associated with various human diseases. Despite the numerous successes of GWAS, the majority of heritability for many complex diseases remains unexplained [1–5]. Recent genomic research provides compelling evidence that the cause of complex human diseases is multifactorial and involves both genetic and environmental factors. Failure to consider sophisticated components such as gene-gene interactions, gene-environment interactions, and epigenetic functions may contribute to the missing heritability for most common diseases.
The underlying genetic architecture can be especially complicated for diseases that develop during human pregnancy, since both maternal and fetal genomes are involved. In general, the fetus inherits one copy of the genome from each of its parents, and the two copies are not identical. Previous family-based or twin studies indicated that the heritability of obstetric diseases is high. For example, an earlier twin study reported a heritability of 17% for preterm delivery in the first pregnancy and 27% for preterm delivery in any pregnancy [6], and another study suggested a heritability range of 25%–40% for birthweight and gestational length [7]. Maternal and fetal genes, either individually or in combination, could increase the risk of diseases such as hemolytic disease of the newborn [8], preterm birth [9, 10], small for gestational age [11], preeclampsia [12–14], and preterm prelabor rupture of membranes (pPROM) [15]. The incompatibility between maternal and fetal genotypes, in which the expression of genes from the two generations leads to an opposite effect, plays a vital role and can increase the risk of these diseases. However, most current association studies on obstetric diseases have searched only one genome for susceptibility loci; that is, only the maternal or the fetal genome was searched for associated genetic factors when a maternal or fetal disorder was studied.
Evidence is accumulating that the interaction between maternal and fetal genes, rather than maternal genes alone, plays an important role in the etiology of pregnancy complications [16–19]. In other words, an increased risk of certain disorders could be due to a specific combination of maternal and fetal genotypes. At any given locus, the mother and fetus are guaranteed to share only one allele. Mismatches between maternal and fetal genotypes may lead to adverse effects while the fetus resides in utero and increase the risk of disease. A good example of this deleterious effect comes from the allogenic response. If a biallelic locus has a null allele and an antigen-coding allele, the mother is homozygous for the null allele, and the fetus inherits from the father an allele that codes for an antigen, the mother may produce an allogenic response to the fetal antigen, which is harmful to the fetus. This type of incompatibility between maternal and fetal genotypes is well illustrated by Rh incompatibility, which develops when a pregnant woman is Rh-negative (d/d) and the fetus is Rh-positive (D/d) at the RhD locus. Red blood cells from the fetus can cross into the maternal blood stream through the placenta. The maternal immune system treats Rh-positive fetal cells as foreign and makes antibodies against the fetal blood cells. These antibodies may cross back into the developing fetus and destroy its circulating blood cells, which can cause hemolytic disease of the newborn (HDN). Therefore, the identification of genes with maternal-fetal genotype incompatibility (MFGI) by searching parental and offspring genomes simultaneously is highly recommended [20–26].
Study designs in which data are collected from parent-offspring triads or mother-offspring dyads are the most commonly used to investigate the marginal and joint effects of maternal and fetal genes. Most currently available statistical approaches for analyzing this type of data fall within the framework of generalized linear regression models. Maternal-fetal genotype tests based on log-linear modeling for child-parent triads have been developed [20, 27–29]. These tests are robust to population stratification because they compare the distribution of affected and unaffected individuals given the parental mating type instead of comparing frequencies of alleles/genotypes between cases and controls. However, these tests and their extensions require that at least some paternal data be available. For situations in which paternal data are entirely missing (dyad sampling), methods based on logistic regression models have been proposed [22, 23, 25].
Although it has been widely hypothesized that mismatches between maternal and fetal genotypes can cause incompatibility, the underlying biological mechanism remains unclear. Therefore, it is challenging to appropriately model incompatibility and code the corresponding variable. That is, if a variable G_{ic} denotes the MFGI effect, it is unclear when the variable should be coded as 1 and when as 0. Parimi et al. [30] evaluated the performance of 6 plausible incompatibility models and concluded that the most comprehensive model, which codes genotype incompatibility whenever maternal and fetal genotypes differ, consistently outperformed the other models. However, only the maternal-fetal incompatibility effect was simulated in their study; the maternal main effect and the fetal main effect were not considered along with MFGI. When a maternal or fetal main effect coexists with MFGI, this approach dramatically inflates the type I error. Even if only an incompatibility effect is present, the recommended model does not always achieve greater power than the true incompatibility model.
In this study, we developed a 2-step statistical strategy for testing MFGI effects in designs that collect data from the mother and offspring; the strategy can increase the testing power under a wide range of scenarios. We propose to select the MFGI model based on an entropy measurement via a permutation procedure, and then to test the MFGI effect using the selected incompatibility model within the logistic regression framework.
Methods
Genetic model
Consider a study that enrolls case and control mother-offspring pairs from a target population. Collected data include genotypes of mothers and offspring, the disease phenotype of interest (phenotype of mother or child), and other covariates, for a total of n independent mother-offspring pairs (n_{0} controls and n_{1} cases, n_{1} + n_{0} = n). Let G_{m} and G_{o} denote the maternal and fetal genotypes of a particular SNP, respectively. Under the commonly used additive genetic model, G_{m/o} = 0, 1, or 2 if the mother/offspring has 0, 1, or 2 copies of the minor allele. Let Y = (y_{1}, y_{2}, ⋯, y_{n})^{T} denote the vector of phenotypes, where y_{i} is the dichotomous disease outcome of the i^{th} family unit in the sample, with y_{i} = 1 or 0 corresponding to affected or unaffected individuals.
Consider a biallelic genomic locus with 2 alleles, A and a, where A denotes the rare allele. Under Mendelian inheritance, there are seven possible maternal-fetal genotype combinations (see Table 1). The 4 mismatched maternal-fetal genotype combinations are denoted as M_{1}, M_{2}, M_{3}, and M_{4}. It is possible that any of the mismatched maternal-fetal genotype combinations leads to incompatibility, or that only a specific mismatched genotype combination or a certain collection of these combinations is associated with the risk of disease. Therefore, in the absence of evidence from molecular genetics analysis, it is challenging to determine which incompatibility model fits the biological mechanism. Here, we consider 11 biologically plausible incompatibility models (Table 2) and propose a 2-step procedure to identify genomic loci that have an MFGI effect on a disease outcome of interest. We first select an MFGI model based on an entropy measurement and then test the statistical significance of MFGI using the chosen incompatibility model. Details of the 2-step procedure are described in the following section.
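The count of seven combinations, four of them mismatched, can be checked with a short enumeration. This is a minimal sketch: the genotype strings and the helper names `alleles` and `mendelian_pairs` are illustrative and do not come from the study, and the mapping of the four mismatched pairs onto M_{1}–M_{4} in Table 1 is not reproduced here.

```python
from itertools import product

GENOTYPES = ("AA", "Aa", "aa")


def alleles(genotype):
    """Return the set of distinct alleles carried by a genotype string."""
    return set(genotype)


def mendelian_pairs():
    """List (mother, offspring) genotype pairs consistent with Mendelian inheritance."""
    pairs = []
    for mother, child in product(GENOTYPES, repeat=2):
        # The child must carry at least one maternal allele, which rules out
        # the opposite-homozygote pairs (AA, aa) and (aa, AA).
        if alleles(mother) & alleles(child):
            pairs.append((mother, child))
    return pairs


pairs = mendelian_pairs()
mismatched = [(m, c) for m, c in pairs if m != c]
print(len(pairs), "possible pairs;", len(mismatched), "mismatched:", mismatched)
```

Each incompatibility model in Table 2 then corresponds to labelling some subset of the mismatched pairs as incompatible.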
Statistical model
Information theory, which was initially developed in the 1940s [31] to quantify the transmission of information in communication channels within a rigorous mathematical framework, has gained much attention in genetic association studies in recent years [32–35]. Our aim is to propose a model selection strategy, based on entropy, that chooses the MFGI model best represented by the data. Before introducing the model selection strategy, we review some basic concepts of information theory. Entropy measures the uncertainty of a random variable. For a discrete random variable X, entropy is defined as
$$H(X)=-\sum_{i=1}^{d}P(X=x_{i})\log_{b}P(X=x_{i}) \qquad (1)$$
where x_{i} and P(X = x_{i}), i = 1, 2, ⋯, d, are the possible values of X and their corresponding probabilities; b is the base of the logarithm and is commonly taken to be 2 in information theory. We propose the following 2-step procedure to test MFGI effects:

Step 1: Select the MFGI model

Let p and 1 - p be the proportions of cases and controls, respectively, in a given data set. The entropy of the disease outcome D can be computed as
$$H(D)=-p\log_{2}p-(1-p)\log_{2}(1-p) \qquad (2)$$
This entropy serves as a measure of the uncertainty of disease outcome in the initial data set.
Under each of the 11 plausible MFGI models listed in Table 2, the mother or offspring can be characterized as “high risk” or “low risk” based on the genotype combination of the pair. For example, under Model 1, mother-offspring pairs with genotype combination M_{1} = (AA, Aa) are considered “high risk” and other combinations are considered “low risk”. The high and low risk labels split the initial data set into 2 subsets. The entropy of disease outcome within each subset, H(D | risk = high) and H(D | risk = low), can be calculated using Equation (2). The conditional entropy of disease status, given a particular MFGI model, is then defined as the average of the two subset entropies weighted by the proportion of pairs in each subset:
$$H(D\mid \text{model})=P(\text{risk}=\text{high})\,H(D\mid \text{risk}=\text{high})+P(\text{risk}=\text{low})\,H(D\mid \text{risk}=\text{low}) \qquad (3)$$
This conditional entropy measures the remaining amount of uncertainty of disease outcome given the MFGI model. The difference between the original entropy and this conditional entropy is the information gain (or mutual information), which reflects the amount of information that a certain MFGI model provides (Equation (4)).
$$IG(\text{model})=H(D)-H(D\mid \text{model}) \qquad (4)$$
To adjust for the uncertainty of disease status due to sampling, the information gain ratio was used (Equation (5)) as the criterion to select the optimal model to code the MFGI effect.
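The quantities in Step 1 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and because Equation (5) is not reproduced in this text, the gain ratio is assumed here to be the information gain divided by H(D). Other normalizations, such as dividing by the entropy of the high/low risk split, are also in common use and would give different values.

```python
from math import log2


def binary_entropy(p):
    """Entropy (base 2) of a binary outcome with case proportion p (Equation (2))."""
    if p in (0.0, 1.0):
        return 0.0  # no uncertainty when everyone has the same status
    return -p * log2(p) - (1 - p) * log2(1 - p)


def information_gain_ratio(disease, high_risk):
    """Gain ratio for one MFGI coding.

    disease   -- sequence of 0/1 disease labels, one per mother-offspring pair
    high_risk -- sequence of 0/1 flags, 1 if the pair is "high risk" under the model
    """
    n = len(disease)
    h_d = binary_entropy(sum(disease) / n)
    # Conditional entropy: subset entropies weighted by subset size.
    h_cond = 0.0
    for label in (0, 1):
        subset = [d for d, r in zip(disease, high_risk) if r == label]
        if subset:
            h_cond += len(subset) / n * binary_entropy(sum(subset) / len(subset))
    gain = h_d - h_cond  # information gain
    # ASSUMPTION: normalise by H(D); the source's Equation (5) is not shown.
    return gain / h_d if h_d > 0 else 0.0


# A coding that separates cases from controls perfectly gives a ratio of 1,
# and a coding unrelated to disease status gives a ratio of 0.
print(information_gain_ratio([1, 1, 0, 0], [1, 1, 0, 0]))
print(information_gain_ratio([1, 1, 0, 0], [1, 0, 1, 0]))
```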
As shown in Table 2, Model 11 is the most comprehensive model because it includes all 4 incompatible maternal-fetal genotype combinations. The study by Parimi et al. (2008) recommends this model as “optimal” when coding the MFGI effect. Herein we consider this model as the default model. The information gain ratio was calculated for each of the 11 plausible MFGI models, and we then selected the model with the largest information gain ratio as the candidate model. Since a candidate model could be chosen by chance without reflecting the real functional mechanism, a permutation procedure is used to assess how likely the candidate model is to be chosen under the assumption of no genetic association, as follows:

1. Obtain the information gain ratio {R_{i}, i = 1, 2, ⋯, 11} for each model and identify the model with the maximum information gain ratio $R^{\max}=\max\{R_{1},R_{2},\cdots,R_{11}\}$ as the candidate model;

2. For b = 1, 2, ⋯, B, permute the disease labels and obtain the maximum information gain ratio $R_{b}^{\max}=\max\{R_{1,b},R_{2,b},\cdots,R_{11,b}\}$;

3. Calculate the empirical P-value of selecting the model by chance:
$$P\text{-value}=\frac{1}{B}\sum_{b=1}^{B}I(R_{b}^{\max}>R^{\max})$$
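The three steps above can be sketched as follows. This is a minimal illustration, not the authors' code: `select_model`, its parameters, and the dictionary of model codings are hypothetical names, and the gain ratio again assumes normalization by H(D) because Equation (5) is not reproduced in this text.

```python
import random
from math import log2


def entropy(labels):
    """Base-2 entropy of a list of 0/1 labels."""
    p = sum(labels) / len(labels)
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)


def gain_ratio(disease, high_risk):
    """Information gain of a high/low risk split, divided by H(D) (assumed form)."""
    n, h_d = len(disease), entropy(disease)
    groups = ([d for d, r in zip(disease, high_risk) if r],
              [d for d, r in zip(disease, high_risk) if not r])
    h_cond = sum(len(g) / n * entropy(g) for g in groups if g)
    return (h_d - h_cond) / h_d if h_d > 0 else 0.0


def select_model(disease, codings, n_perm=1000, tau=0.0001,
                 default="Model 11", seed=0):
    """Return (analysis model, candidate model, empirical P-value).

    codings -- dict mapping a model name to its 0/1 high-risk flag per pair
    """
    rng = random.Random(seed)
    # Step 1: candidate model = largest observed gain ratio.
    ratios = {name: gain_ratio(disease, code) for name, code in codings.items()}
    candidate = max(ratios, key=ratios.get)
    r_max = ratios[candidate]
    # Step 2: permute disease labels and record the maximum ratio each time.
    shuffled = list(disease)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        r_b = max(gain_ratio(shuffled, code) for code in codings.values())
        exceed += r_b > r_max
    # Step 3: empirical P-value; fall back to the default model if not below tau.
    p_value = exceed / n_perm
    chosen = candidate if p_value < tau else default
    return chosen, candidate, p_value
```

Note that the empirical P-value can only take values in multiples of 1/B, so with a cutoff as small as τ = 0.0001 the number of permutations B needs to be large relative to 1/τ for the comparison to be informative.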
If the obtained empirical P-value is less than a predefined cutoff τ (say τ = 0.0001), we conclude that the candidate model was not selected by chance, and it is used as the analysis model in the next step of testing. Otherwise, Model 11 is used as the analysis model.

Step 2: Test the MFGI effect

Once an optimal incompatibility model is selected, it is used to code the incompatibility effect in a logistic regression model to assess the significance of the incompatibility effect, that is,
$$\text{logit}\,P(y_{i}=1)=\beta+\beta_{m}G_{m}+\beta_{o}G_{o}+\beta_{ic}G_{ic} \qquad (6)$$
where G_{m} and G_{o} represent the maternal and offspring additive variables, which are coded as 0, 1, or 2 corresponding to aa, Aa, and AA, respectively, where A is the risk allele; and G_{ic} is the MFGI variable. The value of G_{ic} depends on the selection result from Step 1. For example, if Model 1 is selected as the analysis model, then G_{ic} = 1 for mother-offspring pairs with genotype combination (AA, Aa) and G_{ic} = 0 otherwise. Testing the MFGI effect corresponds to testing the null hypothesis H_{0}: β_{ic} = 0. The likelihood ratio test was applied for this purpose.
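A likelihood ratio test of H_{0}: β_{ic} = 0 compares the maximized log-likelihood of the model with G_{ic} against the model without it, and refers twice the difference to a chi-square distribution with 1 degree of freedom. The sketch below is a minimal, dependency-free illustration with hypothetical function names; it uses a plain Newton-Raphson fit with no covariates and no safeguards against separation or collinear codings, so a statistical package would be preferable in practice.

```python
from math import erfc, exp, log, sqrt


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_logistic(rows, y, n_iter=25):
    """Newton-Raphson logistic fit; returns the maximised log-likelihood.

    rows -- design matrix as a list of lists, including a leading 1 for the intercept
    """
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(n_iter):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for x, yi in zip(rows, y):
            p = 1.0 / (1.0 + exp(-sum(b * v for b, v in zip(beta, x))))
            for i in range(k):
                grad[i] += (yi - p) * x[i]
                for j in range(k):
                    hess[i][j] += p * (1 - p) * x[i] * x[j]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    loglik = 0.0
    for x, yi in zip(rows, y):
        eta = sum(b * v for b, v in zip(beta, x))
        loglik += yi * eta - log(1.0 + exp(eta))
    return loglik


def lrt_incompatibility(g_m, g_o, g_ic, y):
    """Likelihood ratio statistic and P-value for the G_ic term (1 df)."""
    full = [[1.0, m, o, ic] for m, o, ic in zip(g_m, g_o, g_ic)]
    reduced = [row[:3] for row in full]  # same model without G_ic
    stat = max(2.0 * (fit_logistic(full, y) - fit_logistic(reduced, y)), 0.0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return stat, erfc(sqrt(stat / 2.0))
```

Here `g_ic` would be the 0/1 coding implied by the model chosen in Step 1, for example 1 for pairs with combination (AA, Aa) under Model 1.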
Simulation
To demonstrate that the proposed approach controls the type I error rate and is statistically powerful, we conducted a series of simulations under the null and alternative hypotheses. Genotypes of N = 1,000,000 families (parents and a child) were generated in a population assuming symmetric mating and Mendelian transmission of alleles. Parental genotypes were generated from a multinomial distribution with prespecified genotype frequencies. Either Hardy-Weinberg equilibrium (HWE: minor allele frequency = 0.2) or Hardy-Weinberg disequilibrium (HWD: genotype frequency = (0.18, 0.47, 0.35) for homozygous carriers, heterozygotes, and noncarriers of the minor allele, respectively) was assumed. Fetal genotypes were simulated from the parents’ genotypes following Mendelian inheritance. Paternal data were then dropped to mimic the maternal-fetal study design. Binary phenotypes were simulated based on a quantitative liability variable Z = (z_{1}, z_{2}, ⋯, z_{N})^{T}, where z_{i} denotes the liability variable of the i^{th} subject. A threshold was determined to ensure that disease prevalence remained at 5%. Mother-offspring pairs whose underlying quantitative liability exceeded the threshold were “diagnosed” as affected and the others as unaffected. The simulated data were treated as a population, from which samples of size n were randomly drawn for subsequent analysis.
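The underlying quantitative liability trait was simulated through the following regression model (Equation (7)),
$$z_{i}=\alpha+\alpha_{m}G_{m}+\alpha_{o}G_{o}+\alpha_{ic}G_{ic}+\varepsilon_{i},\quad \varepsilon_{i}\sim N(0,\sigma^{2}) \qquad (7)$$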
where the αs are defined in the same way as the βs in Equation (6). Without loss of generality, we set the overall mean α = 0 and σ^{2} = 1. The performance of our proposed 2-step approach (called the model selection approach) was compared with that of its counterpart without model selection (called the full model approach). Quantitative data were generated using a particular MFGI model listed in Table 2, called the data generating model. Various scenarios were considered (Table 3): Scenario I assumes no genetic effect at all; Scenarios II and III generate data under the null hypothesis of no MFGI effect while allowing a maternal or fetal main effect; Scenarios IV–VI simulate the MFGI effect along with maternal and/or fetal main effects; and Scenarios VII–IX assume the MFGI effect only, at 3 different heritability levels (h^{2} = 0.05, 0.10, 0.15). The effect size of incompatibility was computed as described by Parimi et al.: let $\sigma_{T}^{2}=\alpha_{ic}^{2}q(1-q)+\sigma^{2}$, where q is the proportion of incompatible maternal-fetal genotypes in the simulated population. For a given heritability level h^{2}, we can calculate the incompatibility effect through the equation $h^{2}=1-\sigma^{2}/\sigma_{T}^{2}$.
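Solving the two relations above for the effect size gives α_{ic}^{2} = σ^{2}h^{2} / ((1 - h^{2}) q (1 - q)). The sketch below computes this value and then applies the liability-threshold step for the MFGI-only scenarios. It is an illustration under stated assumptions: the function names are hypothetical, main effects are omitted, and the threshold is taken as the empirical quantile that yields the target prevalence in the simulated list.

```python
import random
from math import sqrt


def incompatibility_effect(h2, q, sigma2=1.0):
    """Non-negative alpha_ic solving h2 = 1 - sigma2 / (alpha_ic^2 q (1 - q) + sigma2)."""
    return sqrt(sigma2 * h2 / ((1.0 - h2) * q * (1.0 - q)))


def simulate_phenotypes(g_ic, alpha_ic, prevalence=0.05, sigma2=1.0, seed=0):
    """Simulate 0/1 disease status from a liability with an MFGI effect only.

    g_ic -- 0/1 incompatibility indicator for each mother-offspring pair
    """
    rng = random.Random(seed)
    # Liability: overall mean 0, MFGI effect, plus normal noise with variance sigma2.
    liability = [alpha_ic * g + rng.gauss(0.0, sqrt(sigma2)) for g in g_ic]
    # Threshold chosen so that the requested fraction of pairs is affected
    # (assumes prevalence * len(g_ic) rounds to at least 1).
    n_cases = round(prevalence * len(liability))
    threshold = sorted(liability, reverse=True)[n_cases - 1]
    return [1 if z >= threshold else 0 for z in liability]


# Example: heritability 0.10 with a quarter of the pairs incompatible.
alpha_ic = incompatibility_effect(0.10, 0.25)
status = simulate_phenotypes([1] * 250 + [0] * 750, alpha_ic)
print(round(alpha_ic, 4), sum(status))
```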
A case study
We illustrate the proposed method with an application to a subanalysis of a broader candidate gene study that investigates the role of genetic factors in the risk of pregnancy complications. Details of this substudy have been previously published in a genetic association study [15]. Briefly, this case-control study includes patients with preterm prelabor rupture of membranes (pPROM) and their neonates, and control mothers with a normal pregnancy and their neonates. Patients of Hispanic origin were enrolled in a research protocol at the Sotero del Rio Hospital, Santiago, Chile.
pPROM occurs in 3%–4.5% of pregnancies in the United States and is responsible for about 30% of preterm births [15]. Although previous studies have suggested the presence of predisposing genetic factors for pPROM [9, 10, 36, 37], the underlying genetic architecture remains unclear. SNPs in 190 candidate genes were selected and genotyped based on their possible biological roles in obstetrical diseases. We analyzed phenotypic and genotype data from the study to determine whether incompatibilities between the maternal and fetal genotypes increase the risk of pPROM. Six samples were removed because of a large proportion of missing genotypes (> 50%) in either the mother sample or the offspring sample. In addition, at each SNP marker, samples that did not follow Mendelian inheritance were excluded from the analysis. Our analysis included 742 SNPs in 190 candidate genes for 721 mother-offspring pairs (case-control ratio = 136:585). Maternal age, which has previously been shown to be statistically significant [15], was included in the model to adjust for its effect. The proposed 2-step procedure and the full model approach were used to analyze the data. Table 4 presents the results of the analysis.

The permutation procedure in the model selection step was handled slightly differently in this analysis: for each permutation, we calculated the maximum information gain ratio at every genomic locus, that is, 742 values for 1 permutation; the maximum information gain ratios for 20 permutations (a total of 742 × 20 = 14,840 values) were then pooled and used to obtain empirical P-values. This reduces the computational time and allows us to address the multiplicity issue. A cutoff value of τ = 0.05 was used in the model selection step because we aim to find as many true positives as possible, although the type I error rate may be slightly inflated when maternal and/or fetal main effects coexist with the MFGI effect.
Results
Simulation results
To assess the type I error rate, we simulated the phenotype under the null hypothesis of no MFGI effect. Specifically, data were generated under Scenarios I–III with sample sizes of 500 and 1000. Empirical type I error rates were estimated as the proportion of simulations with a P-value less than 0.05 across 11,000 replicates. Overall, the test size was well controlled at the nominal level (0.05) for both approaches under all scenarios we considered. The estimates of the type I error rate for the model selection approach depend on the cutoff value τ used in the model selection step. According to our simulations, the empirical type I error rate slightly exceeds the nominal level under Scenarios II and III, where either a maternal or a fetal main effect was simulated: when a loose cutoff value of τ = 0.05 was used, the empirical type I error rate was around 0.06 (detailed data not shown here). As the cutoff value becomes more stringent, the empirical type I error rate approaches the nominal level. Table 5 presents the type I error rates obtained with τ = 0.0001, which are controlled at the nominal level. The subsequent power estimates were also based on τ = 0.0001. As shown in Table 5, the type I error rates for our model selection approach are the same as those for the full model approach under most scenarios. This is because the model selection step almost always chooses the full model (Model 11) when there is no incompatibility effect, leading to the same analysis model for both approaches. HWD had no significant effect on the type I error: estimates of the type I error rate for scenarios under HWD are comparable to those for scenarios under HWE.

Figures 1 and 2 display statistical power estimates for the proposed model selection approach and the full model approach for testing MFGI. The testing power of our model selection approach was generally higher than that of the full model approach under all the scenarios considered. This improvement was more striking for larger sample sizes. For scenarios that assume HWE, when the MFGI effect was simulated together with maternal and/or fetal main effects (Scenarios IV–VI), our method improved the power, particularly when the true incompatibility model was Model 5. For example, our model selection approach had a power of 0.631, whereas the full model approach had a power of only 0.126, to detect the true MFGI effect when Model 5 was used to generate data with a sample size of 1000 under Scenario IV (top right panel of Figure 1). When only the MFGI effect was simulated (Scenarios VII–IX), our model selection approach increased the testing power, especially when the underlying true incompatibility model was Model 1, 2, or 5 (bottom panels of Figure 1). The increase in testing power results from the model selection step, which can choose the true data generating model. The estimated probability that the underlying incompatibility model is selected as the analysis model by our approach approaches 1 at a heritability level of 0.1 or above. At a lower heritability level of 0.05, the estimated probability of selecting the true model decreases, especially for scenarios under HWD (right panel of Figure 3). Although improvements in the testing power for HWD scenarios were not as striking as those in HWE scenarios (Figure 2), the performance of our 2-step approach was still better than that of the full model approach.
Data analysis results
Table 4 summarizes the results of the pPROM data analysis for the 2 approaches. It is evident from the table that our 2-step approach identified MFGIs that could be missed by the full model approach. For example, a P-value of 0.002 was obtained with our proposed approach for both SNPs (rs2979671 and rs3020221) in the intron 6 and exon 4 regions of the gene ANGPT2. In contrast, P-values of 0.1450 and 0.0259 were obtained for SNPs rs2979671 and rs3020221, respectively, with the full model approach. Model 1 was selected as the incompatibility model for SNP rs2979671 in ANGPT2. SNPs with an odds ratio (OR) less than 1 showed protective effects for the defined genotype incompatibility combinations (Table 4). Here, OR refers to the ratio of the odds of developing pPROM in the two risk groups defined by the selected MFGI model. For example, SNP rs2979671 in ANGPT2 had an OR of 0.282, which implies that individuals with the mother-offspring paired genotype combination (A/A, G/A) have a lower likelihood of developing pPROM than those with other genotypes. Such protective effects were also observed for SNPs identified in genes MGP, COL5A2, COL1A2, HLAE, and COL4A2 (ORs and CIs shown in Table 4).
In comparison, SNPs identified in genes MMP14, TNFRSF1A, AQP2, CRHR1, and GJA4 had an OR greater than 1, indicating that mother-offspring pairs with the genotype incompatibility combinations defined by the corresponding selected incompatibility models may be at higher risk of pPROM. For example, for SNP rs2236302 in the exon 5 region of gene MMP14, mother-offspring pairs with the genotype combination (C/G, C/C) are at higher risk of developing pPROM: 33 of the 104 mothers in the defined “high risk” group developed pPROM, whereas only 99 of 611 mothers in the “low risk” group developed pPROM (OR = 2.8013, 95% CI = [1.6398, 4.7854]). The confidence interval of the OR for SNP rs5743627 in gene IL10 covers 1, indicating that the MFGI effect is not marginally significant. To our knowledge, this is the first analysis that specifically investigates the genotype incompatibility effect between maternal and fetal genes underlying pPROM. We believe that our results are helpful for generating hypotheses for future studies or wet lab validations.
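For orientation, an unadjusted odds ratio can be computed directly from the counts quoted for rs2236302. This is a rough sketch with a hypothetical function name; it uses the standard Woolf (log-scale) confidence interval. The crude value it gives is smaller than the OR of 2.8013 reported in the text, which presumably comes from the fitted regression model that also includes the maternal and fetal main effects and maternal age.

```python
from math import exp, log, sqrt


def crude_odds_ratio(cases_high, n_high, cases_low, n_low, z=1.96):
    """Unadjusted odds ratio with a Woolf confidence interval from a 2x2 table."""
    a, b = cases_high, n_high - cases_high  # high-risk group: cases, non-cases
    c, d = cases_low, n_low - cases_low     # low-risk group: cases, non-cases
    odds_ratio = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Counts quoted in the text: 33 of 104 high-risk and 99 of 611 low-risk mothers.
crude_or, ci_low, ci_high = crude_odds_ratio(33, 104, 99, 611)
print(f"crude OR = {crude_or:.2f}, 95% CI = [{ci_low:.2f}, {ci_high:.2f}]")
```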
Discussion and conclusions
The importance of maternal-fetal genotype incompatibility in human diseases, particularly in obstetrical complications, was first discussed in the 1990s [38] and has been studied intensively in recent years [16–19, 23, 24, 26]. Most of the currently available statistical methods for identifying MFGI effects fall within the framework of generalized linear regression [20–22, 25, 30]. Since the underlying MFGI mechanism is unknown and may vary for different genetic variants, it is challenging to appropriately model the incompatibility effect. The complexity largely stems from the underlying competition among 3 sets of genes: the maternally derived fetal gene, the paternally derived fetal gene, and the untransmitted maternal gene [39]. Conflict among the 3 sets of genes may result in an incompatibility effect, which may lead to pregnancy complications such as pPROM.
A commonly used approach is to code the incompatibility effect whenever there is a disagreement between maternal and fetal genotypes [30]. However, our simulation studies show that this simple treatment ignores the underlying disease gene action modes and has potential drawbacks. When maternal and/or fetal main effects exist, the method increases the false-positive rate for incompatibility detection. Rather than predefining an incompatibility model, we propose a strategy to select an optimal incompatibility model that captures the underlying disease gene function. A model is selected as the candidate model if its entropy-based measurement is the maximum among all plausible incompatibility models, and a permutation procedure assesses whether that selection could have arisen by chance. If it passes this check, the candidate model is chosen as the analysis model for further statistical tests that assess the incompatibility effect along with the maternal/fetal main genetic effects.
Intuitively, our approach boosts the statistical power by adding an MFGI model selection step. The power gain results from the fact that the true underlying incompatibility model can be selected most of the time given enough samples. We conducted extensive simulation studies that considered the effects of heritability, the HWE assumption, sample size, and different disease gene functions. The results indicate that, when the underlying truth is unknown, the proposed 2-step strategy works well compared with the full model approach. Our approach controls the type I error rate at the nominal level and achieves higher power than the full model approach, which does not perform incompatibility model selection. Our approach does not impose strong assumptions, and its performance is quite consistent across settings such as HWE or HWD, with or without maternal and/or fetal main effects.
We applied the 2-step approach to study maternal-fetal genotype incompatibility effects associated with pPROM and identified several interesting SNPs. Our findings provide clues about the biological mechanism through which MFGI in these genes may have an adverse or protective effect on pPROM. Our results can be used to generate hypotheses for future biological validations to study the pathogenesis of pPROM.
Overall, this method can be applied to study the maternal-fetal genotype incompatibility component of obstetrical complications, such as preeclampsia, and of other human diseases in which maternal and fetal genetic factors interact and increase the risk of disease.
References
 1.
Slatkin M: Epigenetic inheritance and the missing heritability problem. Genetics. 2009, 182 (3): 845-850. 10.1534/genetics.109.102798.
 2.
Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, Hunter DJ, McCarthy MI, Ramos EM, Cardon LR, Chakravarti A, Cho JH, Guttmacher AE, Kong A, Kruglyak L, Mardis E, Rotimi CN, Slatkin M, Valle D, Whittemore AS, Boehnke M, Clark AG, Eichler EE, Gibson G, Haines JL, Mackay TF, McCarroll SA, Visscher PM: Finding the missing heritability of complex diseases. Nature. 2009, 461 (7265): 747-753. 10.1038/nature08494.
 3.
Eichler EE, Flint J, Gibson G, Kong A, Leal SM, Moore JH, Nadeau JH: Missing heritability and strategies for finding the underlying causes of complex disease. Nat Rev Genet. 2010, 11: 446-450. 10.1038/nrg2809.
 4.
Lee S, Wray N, Goddard M, Visscher P: Estimating missing heritability for disease from genome-wide association studies. Am J Hum Genet. 2011, 88 (3): 294-305. 10.1016/j.ajhg.2011.02.002.
 5.
Zuk O, Hechter E, Sunyaev S, Lander E: The mystery of missing heritability: genetic interactions create phantom heritability. Proc Natl Acad Sci USA. 2012, 109 (4): 1193-1198. 10.1073/pnas.1119675109.
 6.
Treloar S, Macones G, Mitchell L, Martin N: Genetic influences on premature parturition in an Australian twin sample. Twin Res. 2000, 3 (2): 80-82. 10.1375/136905200320565526.
 7.
Clausson B, Lichtenstein P, Cnattingius S: Genetic influence on birthweight and gestational length determined by studies in offspring of twins. BJOG. 2000, 107 (3): 375-381. 10.1111/j.1471-0528.2000.tb13234.x.
 8.
Geifman-Holtzman O, Wojtowycz M, Kosmas E, Artal R: Female alloimmunization with antibodies known to cause hemolytic disease. Obstet Gynecol. 1997, 89 (2): 272-275. 10.1016/S0029-7844(96)00434-6.
 9.
Menon R, Fortunato S, Thorsen P, Williams S: Genetic associations in preterm birth: a primer of marker selection, study design, and data analysis. J Soc Gynecol Investig. 2006, 13 (8): 531541. 10.1016/j.jsgi.2006.09.006.
 10.
Pennell C, Jacobsson B, Williams S, Buus R, Muglia L, Dolan S, Morken N, Ozcelik H, Lye S: PREBIC Genetics Working Group, Relton C: Genetic epidemiologic studies of preterm birth: guidelines for research. Am J Obstet Gynecol. 2007, 196 (2): 107118. 10.1016/j.ajog.2006.03.109.
 11.
Larizza D, Martinetti M, Dugoujon J, Tinelli C, Calcaterra V, Cuccia M, Salvaneschi L, Severi F: Parental GM and HLA genotypes and reduced birth weight in patients with Turner’s syndrome. J Pediatr Endocrinol Metab. 2002, 15 (8): 11831190.
 12.
Goddard KA, Tromp G, Romero R, Olson JM, Lu Q, Xu Z, Parimi N, Nien JK, Gomez R, Behnke E, Solari M, Espinoza J, Santolaya J, Chaiworapongsa T, Lenk GM, Volkenant K, Anant MK, Salisbury BA, Carr J, Lee MS, Vovis GF, Kuivaniemi H: Candidategene association study of mothers with preeclampsia, and their infants, analyzing 775 SNPs in 190 genes. Hum Hered. 2006, 63: 116.
 13.
Laivuori H: Genetic aspects of preeclampsia. Front Biosci. 2007, 12: 23722382. 10.2741/2239.
 14.
SeremakMrozikiewicz A, Drews K, WenderOzegowska E, Mrozikiewicz P: The significance of genetic polymorphisms of factor V Leiden and prothrombin in the preeclamptic Polish women. J Thromb Thrombolysis. 2010, 30: 97104. 10.1007/s1123900904321.
 15.
Romero R, Friel L, Velez-Edwards D, Kusanovic J, Hassan S, Mazaki-Tovi S, Vaisbuch E, Kim C, Erez O, Chaiworapongsa T, Pearce B, Bartlett J, Salisbury B, Anant M, Vovis G, Lee M, Gomez R, Behnke E, Oyarzun E, Tromp G, Williams S, Menon R: A genetic association study of maternal and fetal candidate genes that predispose to preterm prelabor rupture of membranes (PROM). Am J Obstet Gynecol. 2010, 203 (4): 361.e1–361.e30.
 16.
Lin J, August P: Genetic thrombophilias and preeclampsia: a meta-analysis. Obstet Gynecol. 2005, 105: 182–192. 10.1097/01.AOG.0000146250.85561.e9.
 17.
Sinsheimer J, Elston R, Fu W: Gene-gene interaction in maternal and perinatal research. J Biomed Biotechnol. 2010, 2010: 853612.
 18.
Liang M, Wang X, Li J, Yang F, Fang Z, Wang L, Hu Y, Chen D: Association of combined maternal-fetal TNF gene G308A genotypes with preterm delivery: a gene-gene interaction study. J Biomed Biotechnol. 2010, 2010: 396184.
 19.
Boc-Zalewska A, Seremak-Mrozikiewicz A, Barlik M, Kurzawinska G, Drews K: Contribution of maternal-fetal adrenomedullin polymorphism to gestational hypertension and preeclampsia–gene-gene interaction pilot study. Ginekol Pol. 2012, 83 (7): 494–500.
 20.
Sinsheimer J, Palmer C, Woodward J: Detecting genotype combinations that increase risk for disease: maternal-fetal genotype incompatibility test. Genet Epidemiol. 2003, 24: 1–13. 10.1002/gepi.10211.
 21.
Chen J, Zheng H, Wilson M: Likelihood ratio tests for maternal and fetal genetic effects on obstetric complications. Genet Epidemiol. 2009, 33 (6): 526–538. 10.1002/gepi.20405.
 22.
Li S, Lu Q, Fu W, Romero R, Cui Y: A regularized regression approach for dissecting genetic conflicts that increase disease risk in pregnancy. Stat Appl Genet Mol Biol. 2009, 8: Article 45
 23.
Li M, Romero R, Fu WJ, Cui YH: Mapping haplotype-haplotype interactions with adaptive LASSO. BMC Genet. 2010, 11: 79.
 24.
Palmer C: Evidence for maternal-fetal genotype incompatibility as a risk factor for schizophrenia. J Biomed Biotechnol. 2010, 2010: 576318.
 25.
Ainsworth H, Unwin J, Jamison D, Cordell H: Investigation of maternal effects, maternal-fetal interactions and parent-of-origin effects (imprinting), using mothers and their offspring. Genet Epidemiol. 2011, 35: 19–45. 10.1002/gepi.20547.
 26.
Li M, Erickson S, Hobbs C, Li J, Tang X, Nick T, Macleod S, Cleves M, the National Birth Defect Prevention Study: Detecting maternal-fetal genotype interactions associated with conotruncal heart defects: a haplotype-based analysis with penalized logistic regression. Genet Epidemiol. 2014, 38 (3): 198–208. 10.1002/gepi.21793.
 27.
Weinberg C, Wilcox A, Lie R: A log-linear approach to case-parent-triad data: assessing effects of disease genes that act either directly or through maternal effects and that may be subject to parental imprinting. Am J Hum Genet. 1998, 62 (4): 969–978. 10.1086/301802.
 28.
Wilcox A, Weinberg C, Lie R: Distinguishing the effects of maternal and offspring genes through studies of “case-parent triads”. Am J Epidemiol. 1998, 148: 893–901. 10.1093/oxfordjournals.aje.a009715.
 29.
Weinberg C: Methods for detection of parent-of-origin effects in genetic studies of case-parents triads. Am J Hum Genet. 1999, 63: 229–235.
 30.
Parimi N, Tromp G, Kuivaniemi H, Nien J, Gomez R, Romero R, Goddard K: Analytical approaches to detect maternal/fetal genotype incompatibilities that increase risk of preeclampsia. BMC Med Genet. 2008, 9: 60
 31.
Shannon C: A mathematical theory of communication. Bell Syst Tech J. 1948, 27: 379–423. 10.1002/j.1538-7305.1948.tb01338.x.
 32.
Zhao J, Boerwinkle E, Xiong M: An entropy-based statistic for genome-wide association studies. Am J Hum Genet. 2005, 77: 27–40. 10.1086/431243.
 33.
Cui YH, Kang GL, Sun KL, Qian MP, Romero R, Fu WJ: Gene-centric genome-wide association study via entropy. Genetics. 2008, 179: 637–650. 10.1534/genetics.107.082370.
 34.
Dong C, Chu X, Wang Y, Wang Y, Jin L, Shi T, Huang W, Li Y: Exploration of gene-gene interaction effects using entropy-based methods. Eur J Hum Genet. 2008, 16 (2): 229–235. 10.1038/sj.ejhg.5201921.
 35.
Wu C, Li S, Cui Y: Genetic association studies: an information content perspective. Curr Genom. 2012, 13 (7): 566–573. 10.2174/138920212803251382.
 36.
Porter T, Fraser A, Hunter C, Ward R, Varner M: The risk of preterm birth across generations. Obstet Gynecol. 1997, 90: 63–67. 10.1016/S0029-7844(97)00215-9.
 37.
Winkvist A, Mogren I, Högberg U: Familial patterns in birth characteristics: impact on individual and population risks. Int J Epidemiol. 1998, 27 (2): 248–254. 10.1093/ije/27.2.248.
 38.
Haig D: Genetic conflicts in human pregnancy. Q Rev Biol. 1993, 68 (4): 495–532. 10.1086/418300.
 39.
Haig D: Evolutionary conflicts in pregnancy and calcium metabolism: a review. Placenta. 2004, 25 (Suppl A): S10–S15.
Acknowledgements
This work was supported, in part, by NSF grant DMS-1209112, by the Intramural Research Program of the Eunice Kennedy Shriver National Institute of Child Health and Human Development, NIH, DHHS, and by National Natural Science Foundation of China grant 31371336.
Additional information
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
SL developed the model, performed the statistical analysis, and drafted the manuscript. YC conceived the idea and participated in the model design and manuscript writing. RR collected the data. All authors read and approved the final manuscript.
Rights and permissions
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Li, S., Cui, Y. & Romero, R. Entropy-based selection for maternal-fetal genotype incompatibility with application to preterm prelabor rupture of membranes. BMC Genet 15, 66 (2014). https://doi.org/10.1186/1471-2156-15-66
Keywords
 Complex disease
 Pregnancy complications
 Association study
 Maternal-fetal genotype incompatibility