With advances in genotyping techniques, genome-wide association analysis has become the mainstream technique in genetic mapping. However, studies have shown that using information from linkage scans can improve the power of association mapping in genome scans [1]. In addition, linkage analysis can be more powerful than association analysis for some genetic mechanisms, and family data can also help to estimate familial risks [2]. Hence, linkage analysis remains a useful supplemental tool for mapping genes for complex diseases. As complex diseases often involve quantitative biomarkers or environmental factors, incorporating these quantitative factors into linkage mapping can improve the power to detect disease loci [3] or the efficiency of estimating disease loci. Here, efficiency is defined as the inverse of the estimated variance of the disease locus estimate; thus, an estimate with a smaller variance is more efficient. Moreover, the incorporation of covariates provides information that can be used to characterize disease loci, which is important for understanding disease etiologies and mechanisms and for identifying population subgroups that may have particularly high disease risks [4]. Methodologic work has demonstrated that failure to adequately account for gene-covariate interaction in a genetic analysis can mask the effects of both genes and covariates [5–7]. Hence, it is important to develop linkage approaches that allow the inclusion of covariates.
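The definition of efficiency given above can be written compactly. The symbols below are illustrative assumptions introduced here, not notation taken from the cited work: $\hat{\tau}$ denotes the estimated position of a disease locus, and the subscripts "cov" and "0" mark analyses with and without covariates. The relative efficiency ratio is shown only as one common way to express a gain in efficiency.

```latex
\mathrm{Eff}(\hat{\tau}) = \frac{1}{\widehat{\mathrm{Var}}(\hat{\tau})},
\qquad
\mathrm{RE}
  = \frac{\mathrm{Eff}(\hat{\tau}_{\mathrm{cov}})}{\mathrm{Eff}(\hat{\tau}_{0})}
  = \frac{\widehat{\mathrm{Var}}(\hat{\tau}_{0})}{\widehat{\mathrm{Var}}(\hat{\tau}_{\mathrm{cov}})}
```

Under this sketch, a value of RE greater than 1 would indicate that incorporating the covariate yields a more precise locus estimate.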

Thus far, several linkage approaches that include covariates have been proposed to account for linkage heterogeneity or to examine biological, environmental, gene-gene, or gene-environment interaction effects. Devlin (2002) [5] accounted for linkage heterogeneity by incorporating a family-level covariate into likelihood-based mixture models; however, this approach accounts for linkage heterogeneity only. Greenwood and Bull (1997, 1999) [6, 8] incorporated covariates into genome scanning approaches using sib pairs or relative pairs through model-based logarithm of odds (LOD) score approaches, where the generalized expected identity-by-descent (IBD) sharing was modeled as a function of covariates through multinomial logistic regression. Rice (1999) [7] applied a novel technique to detect significant covariates in linkage analyses with a logistic regression approach using all sib pairs (concordant affected, concordant unaffected, and discordant), and Saccone et al. (2001) [9] further extended this analysis to cousin pairs. Olson (1999) [10] proposed a unified framework for model-free linkage analysis that can handle the separate inclusion of other affected relative pairs (ARPs), discordant relative pairs, covariates, or additional disease loci through a conditional-logistic parameterization. These regression-based approaches can easily be generalized to include multiple covariates; however, they assume either one disease locus or multiple unlinked loci and thus are not applicable to analyses of multiple linked loci. Among non-regression-based approaches, Hauser et al. (2004) [11] proposed a model-free LOD score approach that includes family-level covariate information. This approach also assumes only one disease locus and can incorporate only one covariate at a time. In addition, a multiple testing problem may arise when researchers use these approaches to perform multiple tests or analyses with various combinations of loci or covariates.

On the other hand, most two-locus linkage approaches aim to detect the presence of a second susceptibility gene by accounting for the effects of a known susceptibility gene [12–14]. However, when two susceptibility loci are linked, the estimated location of the first gene may be inaccurate because it was mapped without accounting for the effects of the linked gene. Thus, conditional analyses that rely on an inaccurate position for the first locus may yield an inaccurate estimate of the second disease locus as well. Biswas et al. (2003) [15] applied a Bayesian approach to simultaneously detect two linked disease genes; however, their approach was designed to detect genes under locus heterogeneity only, and this model-based approach requires the specification of unknown genetic parameters. Hence, linkage approaches that can simultaneously localize two linked disease genes are needed.

Rather than testing for the presence of linkage, Liang et al. (2001) [16] developed a novel, robust, model-free multipoint linkage method that simultaneously estimates both the position of a disease locus and its effect on the disease, along with the associated sampling uncertainty. The advantages of this method include the following: (i) It does not require specification of an underlying genetic model; hence, estimation of the parameters is robust to a wide variety of genetic mechanisms. (ii) The multiple testing issue is eliminated because a single test statistic is provided for linkage across the entire studied region, rather than a separate hypothesis test for each marker. (iii) Although multiple markers are incorporated simultaneously in the gene mapping, there is no need to specify the phase of the multi-marker genotype data. Many complex diseases, such as hypertension, schizophrenia, diabetes, and asthma, are usually defined as dichotomous phenotypic traits; however, they are also associated with quantitative biological markers or quantitative risk factors. Accordingly, Glidden et al. (2003) [17] incorporated quantitative covariates into Liang's approach [16] and estimated the genetic effect of a disease locus through a logistic-type parametric model using affected sib pairs (ASPs). Based on the same study design, Chiou et al. (2005) [18] incorporated quantitative covariates into their linkage mapping and estimated the genetic effect of a disease locus non-parametrically. Such a quantitative covariate can be either an environmental risk factor or itself a quantitative trait. When a quantitative trait is incorporated as a covariate, its quantitative trait locus (QTL) may directly underlie a pathway of the disease or be linked to the disease locus, or the trait may be indirectly associated with the disease.

Meanwhile, Schaid et al. (2005) [19] extended the covariate-free approach of Liang et al. [16] to different types of ARPs. Their extension removed the restriction to ASPs and allowed an investigator to study the risk ratios of a disease gene estimated from multiple types of relative pairs, which helps to uncover the underlying genetic mechanism of the disease. To jointly localize two linked disease loci using ASP data, Biernacka et al. (2005) [20] extended the approach of Liang et al. [16] to the localization of two linked disease-susceptibility genes. In another article [21], they also provided tests for the presence of two linked disease-susceptibility genes based on a quasi-likelihood ratio test and a modified score test. Lin and Schaid (2007) [22] generalized the two-locus localization method to a variety of ARPs. Their generalized method described both unconstrained and constrained models, along with a score test and a goodness-of-fit assessment for the constrained model used. As the etiology of complex diseases often involves quantitative variables (either genetic biomarkers or environmental factors) in addition to multiple disease loci, it is helpful to incorporate a quantitative variable while localizing two linked disease loci simultaneously using ARPs. We extended Lin and Schaid's (2007) [22] approach to incorporate quantitative covariates in two-locus linkage mapping using ARPs. Generally, a parametric statistical model is simpler and easier to interpret than a non-parametric model, whereas a non-parametric model offers greater flexibility in fitting the data. To take advantage of the strengths of both parametric and non-parametric models, we applied both types of model to incorporate covariates. These methods can also be applied to account for heterogeneity arising from quantitative covariates as well as from multiple subgroups stratified by categorical covariates.
Systematic simulation studies under a variety of quantitative covariates were conducted to evaluate the gain in efficiency of estimating the disease loci with the proposed methods. The estimates from the proposed approaches that incorporate covariates were compared with those from the approach that does not incorporate covariates. The Collaborative Study on the Genetics of Alcoholism (COGA) data released for Genetic Analysis Workshop 14 (GAW14) were used to illustrate the proposed approaches.