The search for genomic regions containing genetic variants that affect production traits requires experimental populations in which QTL segregating within and between parental populations can be identified. The F2 design is commonly used to map QTL segregating in divergent parental lines [2, 3]. Reliable association analyses or genetic evaluations based on genomic information require a large number of individuals with phenotypes and high density (HD) genotypes. However, HD genotypes for large numbers of animals are expensive to obtain [5, 6]. One way of reducing cost is to genotype individuals from base generations (parents) at HD, and their more numerous descendants at low density (LowD) [6, 7]. Then, using SNP selected from the HD panel, called tagSNP, the non-typed SNP are imputed with high accuracy. Imputing HD genotypes of progeny from LowD genotypes, conditional on grandparental and parental HD genotypes, may result in higher imputation accuracies than those obtained using a reference panel of unrelated individuals [7–9]. This is because HD genotypes from base generations can be traced within families by means of co-segregation or descendant probabilities while searching for the phase of parental alleles.
Most studies on genotype imputation in livestock species have been performed with purebreds [4, 7, 9–13], and studies of genotype imputation in crossbreds have been largely absent. With regard to agricultural plant species, studies on genotype imputation have used inbred lines, recombinant inbred lines (RILs) in Nested Association Mapping (NAM) designs [14, 15], and Multiparent Advanced Generation Inter-Cross (MAGIC) studies. Genotype imputation has also been employed in human genome-wide linkage studies to test the association of candidate transcriptional regulators with gene expression. In the mouse, a model organism in biomedical research, imputation of genotypes from crosses of inbred lines has been used to identify candidate genes for complex disease [18, 19].
Imputing genotypes in humans, plants, livestock, or model organisms is similar in the sense that a small number of founding individuals can be genotyped at high density, while the bulk of the mapping population can be genotyped at low density and then imputed using linkage information. In this paper we focus on imputing F2 individuals from a three-generation (F0, F1 and F2) population of Duroc × Pietrain crossbred pigs. The F0 and F1 animals were genotyped at HD (60K). The F0 populations used to map QTL in pigs are typically composed of a small number of animals (in our case, 4 males and 15 females) [1, 20–22]. Because few recombinations are expected to occur in the first generations, these populations have low resolution to map QTL. For the same reason, however, there is potential for attaining high accuracy of imputation. This property can be exploited to impute HD genotypes from inexpensive LowD F2 genotypes, which subsequently allows existing data from experimental populations to be combined in a meta-analysis for association. There are several reasons why this strategy is attractive. First, several of these populations have been created recently [21, 22, 24, 25] and DNA from these animals is available. Second, extensive datasets of phenotypes have been recorded for these populations, including traits that are expensive or difficult to measure, such as the content of intramuscular fat and composition of fatty acids, age at puberty in gilts, and meat tenderness. Finally, these populations are generally developed from breeds that are divergent for some traits of interest such as fat/lean content, meat quality or reproductive efficiency, for example: Duroc × Pietrain [1, 21], Duroc × Landrace, Duroc × Large-White, White-Duroc × Erhualian, Meishan × Duroc, and Berkshire × Duroc.
It follows that imputation of F2 LowD genotypes to HD with high accuracy would be useful and convenient, providing a cost-effective first step for association analyses or meta-analyses. Different methods have been employed to select tagSNP for LowD panels. Two of them are: 1) imposing restrictions on the minimum value of linkage disequilibrium (LD), measured as r², between markers, and 2) selecting tagSNP that are evenly spaced according to the physical distance between markers [4, 11, 12]. In addition, commercial chips are available with medium-density panels of SNP selected to segregate in several populations, for example for bovine and pig. One question that arises is how many SNP are needed to attain a high accuracy of imputation for a given F2 population. Another question is whether a specific chip has to be custom designed, or whether currently available commercial chips can be used. Finally, it is important to determine whether both the F0 and F1 have to be genotyped at HD, or whether genotyping only the F0 is adequate to obtain a high accuracy of imputation in the F2.
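The two tagSNP selection approaches mentioned above can be illustrated with a minimal sketch. It is not the procedure used in this study: the function names (`evenly_spaced_tags`, `r_squared`, `ld_tags`), the 0/1/2 genotype coding, the use of the squared genotype correlation as r², and the greedy coverage rule are all assumptions made for illustration.

```python
from bisect import bisect_left


def evenly_spaced_tags(positions, n_tags):
    """Pick SNP indices closest to an evenly spaced grid of positions.

    positions: sorted physical positions (bp) of SNP on one chromosome.
    May return fewer than n_tags indices if one SNP is nearest to
    several grid points.
    """
    if n_tags >= len(positions):
        return list(range(len(positions)))
    if n_tags == 1:
        return [len(positions) // 2]
    start, end = positions[0], positions[-1]
    step = (end - start) / (n_tags - 1)
    chosen = []
    for k in range(n_tags):
        target = start + k * step
        i = bisect_left(positions, target)
        # Compare the SNP on either side of the grid point.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(positions)]
        best = min(candidates, key=lambda j: abs(positions[j] - target))
        if best not in chosen:
            chosen.append(best)
    return chosen


def r_squared(x, y):
    """Squared Pearson correlation of two genotype vectors coded 0/1/2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic SNP: correlation undefined, treated as 0
    return sxy * sxy / (sxx * syy)


def ld_tags(genotypes, min_r2):
    """Greedy selection: a segregating SNP becomes a tag unless an
    already selected tag covers it with r2 >= min_r2."""
    tags = []
    for i, g in enumerate(genotypes):
        if len(set(g)) == 1:
            continue  # skip monomorphic SNP, which carry no information
        if not any(r_squared(g, genotypes[t]) >= min_r2 for t in tags):
            tags.append(i)
    return tags
```

For instance, `evenly_spaced_tags(list(range(0, 101, 10)), 3)` selects the first, middle and last SNP. With the LD rule, a higher `min_r2` retains more tagSNP, since fewer SNP are considered covered by an existing tag.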
The goal of this research was to estimate the accuracy of imputation to HD (60K) from LowD F2 genotypes for a Duroc × Pietrain population, using different genotyping schemes. The strategies were evaluated by means of Monte Carlo simulation, conditional on the genotypes of animals in the first two generations (F0 and F1). In doing so, two methods of tagSNP selection were considered and their results were compared to those obtained with a commercial chip (9K). In addition to simulations, accuracy of imputation was evaluated using experimental data, taking advantage of a small number of F2 animals that were genotyped at HD.
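As an illustration of how imputation accuracy can be assessed when HD genotypes are known for some F2 animals, the sketch below compares imputed with observed genotypes at masked SNP. The accuracy measure is not defined in this section, so the two measures shown (genotype concordance and proportion of correctly imputed alleles), the function names, and the 0/1/2 coding with `None` for missing observed genotypes are assumptions.

```python
def _scored_pairs(true_geno, imputed_geno):
    """Keep only SNP whose observed genotype is available."""
    pairs = [(t, i) for t, i in zip(true_geno, imputed_geno) if t is not None]
    if not pairs:
        raise ValueError("no observed genotypes to compare against")
    return pairs


def genotype_concordance(true_geno, imputed_geno):
    """Proportion of SNP whose imputed genotype equals the observed one."""
    pairs = _scored_pairs(true_geno, imputed_geno)
    return sum(t == i for t, i in pairs) / len(pairs)


def allele_accuracy(true_geno, imputed_geno):
    """Proportion of correctly imputed alleles (two alleles per SNP)."""
    pairs = _scored_pairs(true_geno, imputed_geno)
    total = 2 * len(pairs)
    wrong = sum(abs(t - i) for t, i in pairs)
    return (total - wrong) / total
```

The allele-based measure gives partial credit when a heterozygote is imputed as a homozygote (one allele wrong), whereas concordance counts the whole genotype as an error.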