Genetic diversity of Phytophthora infestans in the Northern Andean region

Background: Phytophthora infestans (Mont.) de Bary, the causal agent of potato late blight, is responsible for tremendous crop losses worldwide. Countries in the northern part of the Andes dedicate a large proportion of the highlands to the production of potato, and more recently, solanaceous fruits such as cape gooseberry (Physalis peruviana) and tree tomato (Solanum betaceum), all of which are hosts of this oomycete. In the Andean region, P. infestans populations have been well characterized in Ecuador and Peru, but are poorly understood in Colombia and Venezuela. To understand the P. infestans population structure in the Northern part of the Andes, four nuclear regions (ITS, Ras, β-tubulin and Avr3a) and one mitochondrial (Cox1) region were analyzed in isolates of P. infestans sampled from different hosts in Colombia and Venezuela.
Results: Low genetic diversity was found within this sample of P. infestans isolates from crops within several regions of Colombia and Venezuela, revealing the presence of clonal populations of the pathogen in this region. We detected low frequency heterozygotes, and their distribution patterns might be a consequence of a high migration rate among populations with poor effective gene flow. Consistent genetic differentiation exists among isolates from different regions.
Conclusions: The results here suggest that in the Northern Andean region P. infestans is a clonal population with some within-clone variation. P. infestans populations in Venezuela reflect historic isolation that is being reinforced by a recent self-sufficiency of potato seeds. In summary, the P. infestans population is mainly shaped by migration and probably by the appearance of variants of key effectors such as Avr3a.


Background
The potato and tomato late blight pathogen, Phytophthora infestans (Mont.) de Bary [1], has a broad host range within the Solanaceae family including Solanum phureja (yellow potato), S. betaceum (tree tomato), S. quitoense (naranjilla or lulo), Physalis peruviana (cape gooseberry), and other wild species [2,3]. Since the Irish potato famine in the 19th century, this pathogen has been thoroughly studied because of its severe economic impact on agriculture, causing billion-dollar losses annually [4].
Mitochondrial and nuclear DNA regions of P. infestans have been extensively used in different regions of the world to investigate its evolutionary history and population structure [5][6][7]. This has led to the definition of lineages on the basis of molecular data, which has allowed the monitoring of populations and resulted in the generation of important epidemiological inferences about the pathogen's migration [5,6]. Additionally, population studies of P. infestans have been conducted to elucidate the origin and diversity of isolates from cultivated species [5,8,9,10] and to determine if there are any differences between populations from wild and cultivated hosts [11][12][13].
Population studies of P. infestans in the Andean region have focused on Peru and Ecuador. These two countries are considered to be the center of origin of potatoes [14], and thus some authors have proposed that this region is also the center of origin of P. infestans [8,9]. In Ecuador, the pathogen has shown low levels of diversity, and its population structure is strongly influenced by host preference. Each P. infestans clonal lineage is associated with a different host group: the US-1 lineage with tomato, EC-1 with potato, EC-2 with wild solanaceous species, particularly with the Anarrhichomenum section, and EC-3 with S. betaceum [15][16][17]. Additionally, genetic differentiation is found among isolates of P. infestans associated with S. ochranthum [18]. Geographically, no clear subdivision was found in Ecuador when using allozymes and RFLP markers [15]. A different pattern was obtained in Peru, where the lineages are structured neither according to host species [19] nor according to the origin of the hosts, be they cultivated or wild [20]. Other population studies have documented higher diversity in Peru than in Ecuador, according to the number of genotypes found [11]. According to mitochondrial haplotypes, P. infestans has been classified into four main groups: Ia, Ib, IIa and IIb [7,21,22]. Each mitochondrial haplotype has been used to complement the definition of P. infestans lineages. The US-1 lineage has been related to the Ib haplotype, US-6 to the IIb haplotype, EC-1 to the IIa haplotype, and the EC-2 and EC-3 lineages to the Ia haplotype. Recently a new mitochondrial haplotype, Ic, has also been related to lineage EC-2, and has been correlated with a new species, P. andina [8,23].
Colombia is considered to be the fourth largest potato producer in Latin America, with a production of 1.9 million tons and a cultivated surface of 110,000 hectares in 2007 [24]. However, P. infestans population studies there are scarce [25,26]. These studies showed that the P. infestans population had low genetic diversity, with no evidence of sexual reproduction, and consisted mostly of the A1 mating type. It should be noted that the A2 mating type was recently detected, but only in one isolate [26], and subsequent intensive sampling at this site did not detect the A2 mating type again [26]. To date, only one study has investigated the population structure of P. infestans in Venezuela [27]. Among the isolates sampled in that study, only the A1 mating type was reported.
This is the first study that makes use of nuclear and mitochondrial regions to conduct a population genetic analysis of the pathogen in the North Andean region. Furthermore, sequence diversity at the Avr3a avirulence (effector) gene was examined to investigate whether populations are subject to selection at this locus. Only two alleles have been described for the Avr3a locus according to virulence and avirulence against the S. demissum R3a gene [28]. Since effector genes affect pathogenicity and fitness, they may be under specific selective pressures after mutation dissemination through genetic drift, responding to host populations or other local factors. Local adaptation seems to play an important role in the diversification and distribution of effector alleles/haplotypes in natural and agricultural environments [29]. Thus this locus could be informative about the adaptive history of P. infestans.
Our goals were to conduct sequencing of different genic regions for populations of P. infestans from the Northern Andes to describe the geographical distribution of diversity. We examined genes likely to be selectively neutral or under selection pressure in order to evaluate the importance of selection, drift and gene flow in specific genic regions. Additionally, these analyses are useful in comparing population parameters of the Northern Andes populations of the pathogen with other populations previously described in different regions worldwide.

Strains and Culture Conditions
A total of 80 strains from Colombia and Venezuela were included in this study (Additional files 1 and 2). Sixty-eight Colombian strains were obtained from different states, including Cundinamarca (N = 23), Antioquia (N = 18), Nariño (N = 15) and Boyacá (N = 12), isolated from S. tuberosum, S. phureja, S. lycopersicum, S. betaceum, S. quitoense, and P. peruviana. All the strains from Cundinamarca and seven from Antioquia were previously characterized using mitochondrial haplotype, metalaxyl resistance, isoenzyme analysis (Gpi and Pep) and restriction fragment length polymorphism with probe RG57 (Additional file 1) [26]. Venezuelan strains (N = 12) were selected from a collection of more than 500 strains isolated from S. tuberosum [27] in the three Andean states of Mérida, Táchira and Trujillo. All isolates were collected from commercial crops and were identified as P. infestans sensu lato by ITS sequencing. Additionally, eight isolates from Ecuador, kindly provided by Michael Coffey, were included for the diversity analysis. Isolates US940480 (mating type A2) and US940494 (mating type A1), provided by William E. Fry (Cornell University), were used as reference strains. All isolates were cultured on rye medium and incubated at 18°C for eight days in the dark [30].

DNA Amplification and Sequencing
DNA extraction was performed as previously described [31]. Four nuclear loci, ITS, β-tubulin, Ras and Avr3a, as well as one mitochondrial gene, cytochrome c oxidase 1 (Cox1), were amplified and sequenced. For ITS, Internal Transcribed Spacer 1, 5.8S rDNA and Internal Transcribed Spacer 2 were amplified using ITS4 and ITS5 primers, yielding a 950-bp product [32]. Primers TUBUF2 and TUBUR1 were used to amplify a 989-bp portion of the β-tubulin gene, excluding introns [33].
The Ras sequence was obtained after two independent amplifications. IRF and IRR primers were used to amplify a 223-bp region corresponding to intron 1. Additionally, RASF and RASR primers amplified a 600-bp product corresponding to partial exons 3 and 6, complete exons 4 and 5, and complete introns 3 to 5 [8,34]. Avr3a was obtained after modification of previously described primers [28] with an M13 tail to allow for amplification and sequencing of 453 bp corresponding to the entire gene (M13Pex F: 5'-GTAAAACGACGGCCAGCCATGCGTCTGGCAATTATGCT-3' and M13Pex R: 5'-CAGGAAACAGCTATGACCTGAAAACTAATATCCAGTGA-3'). COXF4N and COXR4N primers were employed to amplify 972 bp of the mitochondrial gene Cox1 [33].
For all primer combinations the reaction conditions were the following, in a final volume of 25 μl: 1× PCR buffer, 0.5 mM dNTPs, 2.5 mM MgCl2, 0.2 mM of each primer, 1 U Taq DNA polymerase. PCR conditions for ITS and Ras started with an initial denaturation step of 96°C for 2 min (1 min for Ras), followed by 35 cycles of 96°C for 1 min, 55°C for 1 min (56°C for Ras) and 72°C for 2 min, and a final step of 10 min at 72°C. For β-tubulin and Cox1, the conditions started with a denaturation step of 2 min at 94°C, followed by 35 cycles of 94°C for 30 s (1 min for Cox1), 60°C for 30 s (45 s for Cox1) and 72°C for 1 min, and a final step of 10 min at 72°C. For Avr3a, the amplification conditions included a denaturation step of 2 min at 94°C followed by 15 cycles of 30 s at 94°C, 30 s at 55°C and 1 min at 72°C. Twenty-five additional cycles of 30 s at 94°C, 30 s at 62°C and 1 min at 72°C were then added. The final extension was performed for 10 min at 72°C. The amplification products were separated on 1% agarose gels and visualized by staining with ethidium bromide. Single-band PCR products were sequenced by Macrogen (Korea). Sequence assembly and editing were performed manually in CLC DNA Workbench (http://www.clcbio.com). Sequence alignments were performed using MUSCLE [35]. Because P. infestans isolates are frequently heterozygous, double peaks were assigned the corresponding IUPAC ambiguity code when necessary.
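The IUPAC coding of double peaks described above can be illustrated with a minimal Python sketch. It is not the procedure used in CLC DNA Workbench; the function names and the `(primary, secondary)` peak representation are hypothetical and chosen only for illustration. Only the two-base IUPAC codes themselves are standard.

```python
from typing import Iterable, Optional, Tuple

# Standard IUPAC codes for a site where two different bases are observed.
IUPAC_TWO_BASE = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}

def call_site(primary: str, secondary: Optional[str] = None) -> str:
    """Return a base call; use an IUPAC code when two different peaks overlap."""
    primary = primary.upper()
    if secondary is None or secondary.upper() == primary:
        return primary  # single peak: plain base call
    return IUPAC_TWO_BASE[frozenset((primary, secondary.upper()))]

def encode_read(peaks: Iterable[Tuple[str, Optional[str]]]) -> str:
    """Encode a read given one (primary, secondary-or-None) tuple per site."""
    return "".join(call_site(p, s) for p, s in peaks)

# Hypothetical four-site read with double peaks at sites 2 and 3.
example = [("A", None), ("C", "T"), ("G", "A"), ("T", None)]
print(encode_read(example))  # AYRT
```

A heterozygous C/T site is thus written as Y and a G/A site as R, which is the form in which such sites enter haplotype reconstruction.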
Haplotype reconstruction was conducted using DnaSP v4.90.1 [36], implementing the algorithm provided in PHASE [37]. PHASE assigns a probability of correct inference of haplotype phase at every heterozygous position. PHASE simulations were repeated four times for each locus, two without recombination and two with recombination. Each simulation was run with 5,000 iterations. In all cases, the most common output inferring haplotypes with >95% confidence was accepted. All the haplotype sequences were deposited in GenBank under accession numbers [GenBank: GU258154 and GU258156] for Ras, [GenBank: GU258157-GU258165; GU258167, GU258168] for β-tubulin, [GenBank: GU258058 and GU258059] for Cox1 and [GenBank:GU258052-GU258057] for Avr3a. Additionally, sequences for ITS were deposited for all the strains employed under accession numbers [GenBank:GU258061-GU258072, GU258074-GU258077, GU258080-GU2580125, GU2580127-GU2580152].

Gene Networks and Population Genetic Analyses
Different datasets were generated for each DNA region. Additional sequences of ITS, β-tubulin, Ras and Cox1 from P. infestans isolates collected worldwide were retrieved from GenBank and included in the analyses (Additional file 3). The genealogical relationships among haplotypes were established by statistical parsimony with a 95% connection limit, as implemented in the TCS software version 1.21 [38].
Nucleotide diversity (π), nucleotide substitution rate (θ) and haplotypic diversity (HD) were calculated using DnaSP v.4.90.1 [36] for each data set from the Northern Andean region, including Colombia and Venezuela (NA), and for the rest of the world, corresponding to available sequences from other geographical origins (R). For the worldwide data set (W), the combination of the NA and R data sets was used (W = NA + R). The population genetics analyses were performed only with the NA dataset. To test the hypothesis of genetic structure (complete panmixia vs. geographically structured populations) among these populations, a permutation test for Hudson's statistics [39] was performed with SNAP Workbench [40]. Sequences were collapsed into haplotypes, recoding indels and excluding infinite-sites violations, using MAP TOOL [40]. Then, a distance matrix was generated with SEQTOMATRIX [40]. This matrix was used to perform a non-parametric permutation test with PERMTEST [40] under default parameters.
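The three summary statistics named above can be sketched in a few lines of Python. This is an illustration on a made-up alignment, not a reimplementation of DnaSP: it assumes gap-free, phased sequences of equal length, and it assumes that θ is estimated per site from the number of segregating sites (Watterson's estimator), which the text does not state explicitly. All function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def segregating_sites(seqs):
    """Number of alignment columns with more than one allele."""
    return sum(len(set(col)) > 1 for col in zip(*seqs))

def nucleotide_diversity(seqs):
    """pi: mean number of pairwise differences, per site."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return total / len(pairs) / len(seqs[0])

def watterson_theta(seqs):
    """theta per site from segregating sites (assumed estimator)."""
    a1 = sum(1.0 / i for i in range(1, len(seqs)))
    return segregating_sites(seqs) / a1 / len(seqs[0])

def haplotype_diversity(seqs):
    """HD = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1.0 - sum(f * f for f in freqs))

# Toy alignment (invented): four sequences, eight sites, three haplotypes.
toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACCTACGT"]
print(nucleotide_diversity(toy), watterson_theta(toy), haplotype_diversity(toy))
```

On this toy alignment, two low-frequency variants give a haplotype diversity that is high relative to π, the same qualitative pattern that low-frequency alleles produce in a largely clonal sample.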
Tajima's D [41] was used as a test of neutral evolution for β-tubulin, Ras, Avr3a and Cox1 with DnaSP v.4.90.1 [36]. A mismatch distribution analysis was carried out with the statistic R2 [42] to detect population size changes, as implemented in DnaSP v.4.90.1 [36]. To obtain a confidence interval for the observed R2 value, a coalescent simulation with 1000 replicates was run with theta (θ) and no recombination as starting parameters.
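As a rough illustration of the neutrality test, Tajima's D compares the mean number of pairwise differences with the number of segregating sites scaled by a sample-size constant, and divides the difference by its estimated standard deviation. The sketch below follows the standard published formula for gap-free sequences; it is not DnaSP's code, the function name is hypothetical, and the alignment is invented.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(seqs):
    """Tajima's D for aligned, gap-free sequences; None if no site varies."""
    n = len(seqs)
    s = sum(len(set(col)) > 1 for col in zip(*seqs))  # segregating sites
    if s == 0:
        return None
    pairs = list(combinations(seqs, 2))
    # k: mean number of pairwise differences (not divided by length)
    k = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))

# Toy alignment (invented): both variants are singletons, so D is negative.
toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACCTACGT"]
print(tajimas_d(toy))
```

An excess of rare variants, as in this toy case, pulls D below zero, which is why negative values are commonly read as a signal of population expansion or of purifying selection.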
The software SITES [43] was used to estimate the number of fixed differences or shared polymorphisms among the defined populations in the NA dataset, using β-tubulin, Ras and Cox1. Fixed differences would be indicative of isolation, while shared polymorphisms would indicate gene flow. MIGRATE-n [44] was used to estimate theta and the direction and amount of gene flow among these populations, with nuclear (β-tubulin, Ras) and mitochondrial (Cox1) genes. In each case, 20 short chains with 5000 sampled genealogies and five short chains with 50000 genealogies were run. Heating was set to be active with four temperatures (1.0, 1.5, 2.5 and 3.0).
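The distinction between fixed differences and shared polymorphisms can be made concrete with a small Python sketch. It is a simplified illustration rather than the SITES algorithm: it assumes two gap-free alignments of equal length, and it counts a site as a shared polymorphism only when both populations segregate for at least two of the same alleles. The function name and the example sequences are hypothetical.

```python
def classify_sites(pop_a, pop_b):
    """Count (fixed differences, shared polymorphisms) between two populations."""
    fixed = shared = 0
    for col_a, col_b in zip(zip(*pop_a), zip(*pop_b)):
        alleles_a, alleles_b = set(col_a), set(col_b)
        if len(alleles_a) == 1 and len(alleles_b) == 1 and alleles_a != alleles_b:
            fixed += 1   # each population monomorphic for a different allele
        elif len(alleles_a & alleles_b) > 1:
            shared += 1  # both populations segregate for the same alleles
    return fixed, shared

# Invented example: site 1 is a fixed difference, site 4 a shared polymorphism.
pop_a = ["ACGT", "ACGA"]
pop_b = ["TCGT", "TCGA"]
print(classify_sites(pop_a, pop_b))  # (1, 1)
```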
Pairwise rates of nonsynonymous nucleotide substitutions per nonsynonymous site (dN) and synonymous nucleotide substitutions per synonymous site (dS) were estimated for Avr3a using the approximate method of Nei and Gojobori (1986) [45] implemented in the YN00 program in PAML [46]. CODEML was employed to identify the amino acids under strong diversifying selection. Avr3a sequences were translated into amino acids and aligned using P. sojae [GenBank: EF463064] as an outgroup. Details of these tests are explained elsewhere [47].

Summary statistics and Gene Networks
Tables 1 and 2 contain the summary statistics for the analyzed sequences and haplotype information. The number of sequences analyzed differed for each gene, due to challenges with the amplification/sequencing of specific isolates or availability of previously published data (Table 1). The ITS region showed no polymorphism in any of the 755 sites analyzed and was excluded from further analyses (and from Table 1). The Northern Andean region (Venezuela and Colombia) showed low levels of nucleotide diversity and genetic variability (Table 1). The regional distribution of the haplotypes is shown in Figure 1. The highest number of haplotypes was observed for β-tubulin. The haplotype most frequently found (H1) was present in Colombia and Venezuela. Venezuelan P. infestans showed only two haplotypes, H1 and H8. The Central Andes showed four haplotypes, one of them restricted to this zone (H10). In the Eastern Andes three haplotypes were found for β-tubulin, and H11 was only found in this region. The Southwestern Andes showed the highest number of different haplotypes for this gene (9), five of them (H2, H3, H4, H5 and H6) restricted to this area. For Ras, the haplotype nomenclature of Gomez-Alpizar et al., 2007 [8] was used for comparison purposes. The Ras region showed two haplotypes (H2 and H13), with only one haplotype (H2) found in the Southwestern, Central and Eastern Andes. Venezuelan isolates were heterozygous (H2/H13), and the H13 haplotype was exclusive to this country. Cox1 showed two haplotypes (H1 and H5), and H1 was present in all regions. All isolates from the Central Andes belonged to only this haplotype, which was also the most common one in the Southwestern and Eastern Andes. H5 was the most common haplotype in Venezuela (Figure 1). For Avr3a, four haplotypes were found. The most frequent allele, H1, was present in Colombia and Venezuela. H5 and H6 were only found in the Southwestern Andes. H2 was only found in the Eastern Andes.
All isolates from S. tuberosum corresponded to haplotype H1, which was the most common haplotype. In S. betaceum the main haplotypes were H1, H2, H5 and H6 (Additional file 1).
Gene networks were constructed for each Northern Andean region and for Colombia and Venezuela when more than one haplotype was found, in order to determine the evolutionary relationship among them (Additional files 4 and 5). As a general pattern, nuclear and mitochondrial haplotypes present in the Eastern and Central Andes as well as in Venezuela were only separated from each other by one or two steps. However, in the Southwestern Andes a different pattern was observed for β-tubulin and Avr3a. Three haplotypes were found for Avr3a: H1, H5 and H6. H5 and H6 were six and nine steps distant from H1, respectively. For β-tubulin, nine haplotypes were found and only five of them were related to each other by one step. More than two steps separated the other four haplotypes from each other.
Nucleotide diversity (π) and substitution rate were low in all genes analyzed in the R and W datasets (Table 1). Again, datasets were designated NA for Colombia and Venezuela, R for the rest of the world, which corresponds to available sequences from other geographical origins, and W for worldwide, corresponding to the combination of the two previous datasets (W = NA + R). The Northern Andean region showed the same pattern of low nucleotide diversity and genetic variability as the rest of the world population (Table 1). Haplotypic diversities were high as a consequence of heterozygous individuals showing alleles with low frequency. β-tubulin showed the highest number of haplotypes, followed by Avr3a, Ras and Cox1 (Table 1 and Figure 2). The Northern Andean region shared with the rest of the world three haplotypes for β-tubulin (H1, H7, H9), two haplotypes for Cox1 (H1 and H5), and just one haplotype for Ras (H2) and Avr3a (H1).
The estimated gene networks for the W dataset (Figure 2) showed as a general pattern a dominant haplotype separated by few steps from the others. The network representing the β-tubulin haplotypes showed a dominant haplotype, H1, present in Central America and South America (Figure 2A). H7 was present in Central America, South America, North America, and Africa. H9 was found in North and South America, as well as in Europe (The Netherlands). The remaining haplotypes were restricted to South America. The most divergent clade was represented by isolates from S. betaceum (data not shown). For Cox1, two haplotypes (H1 and H5) were widely distributed among regions and connected directly in the network. The three remaining haplotypes were derived from H5; two of them were restricted to Central America (Mexico) and one, H4, to South America (Ecuador) (Figure 2B). For the Ras gene, a dominant haplotype H2 was present in North America, Central America and South America, as well as in Europe (Ireland) (Figure 2C). H2 was mainly associated with S. tuberosum, although this species also hosted other haplotypes. H13 was only found in South America (Venezuela).

Population Genetic Analyses
To test the hypothesis of population subdivision, isolates were first assigned to populations following their geographical distribution on a regional scale. Then, permutation tests for Hudson's statistics were performed for all combinations among regions (Table 3). Significant genetic differentiation was found for Ras and Cox1, between the Eastern Andes and Venezuela and between the Central Andes and Venezuela (Table 4). For β-tubulin, population subdivision was found between the Central and Eastern Andes and between the Southwestern and Eastern Andes (Table 3). Avr3a showed no differentiation according to geographical location.
Tajima's D test failed to reject the null hypothesis of neutral evolution in Ras, β-tubulin and Cox1 (Table 4). This indicates that, for these loci, the Northern Andean population has not been subject to selection pressure.

However, the Avr3a gene was found to be subject to selection pressure. A negative tendency was observed for Avr3a and β-tubulin, indicating a possible population expansion or a recent bottleneck effect. The R2 test failed to reject the null hypothesis of constant population size in the NA populations. The L-shaped pattern found in the mismatch distribution graphs of all genes (Additional file 6) is also indicative of a constant population size. No fixed differences were observed in any of the defined populations in the NA dataset (data not shown). The analyses of gene flow were in general concordant between nuclear and mitochondrial genes (Table 5). Nuclear and mitochondrial genes showed low levels of gene flow between Colombia and Venezuela. Both nuclear and mitochondrial genes showed that the Central and Eastern Andes are exchanging migrants.

Avr3a Amino Acid Analysis
Six different amino acid sequences were obtained after the translation of the nucleotide sequences from the W dataset (Figure 3). H1 was characterized by amino acids E80 and M103 and is associated with a virulent phenotype. H3, characterized by amino acids K80 and I103 and associated with an avirulent phenotype, was exclusively found in Ecuador, as was H4. To identify which amino acid positions were under diversifying selection, three pairs of ML models of amino acid substitutions were tested, where ω = dN/dS. Each pair consisted of one model that allowed sites to be under diversifying selection with ω > 1 (M2, M3 and M8) and a second model that did not allow sites to be under selection (M0, M1 and M7). The likelihood ratio tests failed to reject the models of neutral evolution. However, model M8 assigned a small percentage (2%) of amino acids to a class under selection pressure (ω > 1). These amino acid positions were S19C, E80K and M103I.
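The likelihood ratio tests mentioned above compare twice the difference in log-likelihood between nested models against a chi-square distribution. The sketch below assumes 2 degrees of freedom, the value commonly used for the M1 vs. M2 and M7 vs. M8 comparisons (the M0 vs. M3 comparison has more degrees of freedom and is not covered here). The log-likelihood values are hypothetical placeholders, not results from this study, and the function name is illustrative.

```python
from math import exp

def lrt_two_df(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models differing by 2 parameters.

    Returns (statistic, p_value). With 2 degrees of freedom, the chi-square
    survival function has the closed form exp(-x / 2).
    """
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    return stat, exp(-stat / 2.0)

# Hypothetical log-likelihoods for a neutral model and a selection model.
stat, p = lrt_two_df(lnl_null=-1000.0, lnl_alt=-998.0)
print(stat, p)
```

With these placeholder values the statistic is 4.0 and the p-value is above 0.05, so the neutral model would not be rejected; this mirrors the kind of outcome reported in the text, in which the selection model fits slightly better but not significantly so.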

Discussion
This is the first time that a detailed population genetic analysis of P. infestans in the northern Andean region was conducted. Until now, the pathogen population structure was unknown in this portion of the continent (Colombia and Venezuela). Colombia is situated in a critical geographical position, acting as a bridge between Central and South America. According to FAO statistics, this country has a commercial history in trading potatoes with several countries in Europe, Asia, North and South America, and the Caribbean [24]. This suggests there have been opportunities for migration of the pathogen, along with its host, towards and within the northern Andean region.
The low genetic diversity found on global and regional scales in the Northern Andean region (Colombia and Venezuela), in both mitochondrial and nuclear regions, is consistent with previous reports in other countries of South America [8]. Even for microsatellite markers low levels of genetic diversity have been reported in Colombia and Venezuela [25][26][27]. Specifically, for Ras, only 2 out of 13 haplotypes found globally could be observed in the region. The A2 mating type was discovered in Ecuador and Colombia [16,26], and sexual reproduction is expected to produce new allele combinations. However, even though the A1 and A2 mating types converge in the same geographical region, as occurred in Cundinamarca Department (Colombia), sexual recombination is apparently not prevalent. This is probably due to a process of host adaptation [11,16,48], or because the presence of the A2 mating type is too recent to have resulted in recombination. Additionally, the low frequency of the A2 mating type described in Colombia diminishes the chances for sexual reproduction to occur [26]. Indeed, in the nuclear genes few individuals were heterozygous, suggesting that genetic interchange occurs but at extremely low frequency. Because of this, the presence of heterozygous individuals might be the result of ancestral polymorphisms or gene flow and not the result of sexual reproduction.
The gene networks for the different regions showed, as a general pattern, that the relationships between the haplotypes in each population of the Northern Andean region are consistent with variation within a clonal population and that haplotypes are separated by one or two steps. However, in the Southwestern Andes, at least for β-tubulin and Avr3a, more than two steps separated four and two haplotypes, respectively. The more divergent haplotypes were found in samples from S. betaceum and corresponded to the mitochondrial haplotype Ia. Recently it has been suggested that a new species, P. andina, with an Ia mitochondrial haplotype, could be associated with S. betaceum [8,23,49]. However, its taxonomic status has not been clarified yet, and it continues to be considered as P. infestans sensu lato. The presence of this new species in the population of the Southwestern Andes could explain the different pattern observed in Avr3a and β-tubulin.
The genetic variation of P. infestans is mainly explained by the variability present within each population that is generated by the existence of different low-frequency alleles, particularly for β-tubulin and Avr3a in the Southwestern Andes. Nevertheless, the variation detected in this study was enough to show some genetic differentiation. At the regional level, the P. infestans population in Venezuela appeared to be isolated from the Central and Eastern Andes populations. Indeed, at the mitochondrial level Venezuelan isolates belonged largely (11 out of 12) to the Ia haplotype, and just one corresponded to IIa, the most common haplotype in the North Andean region. This and the presence of a particular haplotypic composition for the nuclear genes suggest that the Venezuelan population has probably been structured by different population events. Reasons for the apparent genetic isolation of the Venezuelan isolates of P. infestans have been discussed elsewhere [27]. The estimates of gene flow support the idea that the population of Venezuela is not donating or receiving migrants from any other region (Table 5). However, a large area has not been sampled on the Colombian side close to the Venezuelan border. More sampling is therefore needed in this region in order to detect possible recent gene flow.
Additionally, two significant patterns were observed. First, subdivision was found between the Eastern Andes and the rest of Colombia for β-tubulin. This could be the result of a long history of self-sufficiency in potato seed tuber supply in the Eastern Andes, which avoids movement of the pathogen on plant tissue. Second, results obtained with the mitochondrial gene Cox1 and nuclear genes suggested that historically the Southwestern and the Eastern P. infestans populations have been isolated, but recent gene flow could be taking place.
Migration estimates correspond to migrants per generation (Nm); 5% and 95% confidence intervals are reported as implemented in MIGRATE [45]. For each population the estimates correspond to the nuclear genes (above) and the mitochondrial gene (below).
The selection imposed on a gene with a known function in host recognition, such as Avr3a, may have resulted in a different pattern of sequence polymorphisms in the sampled population in comparison to the other nuclear regions analyzed. However, low genetic diversity was found for this gene. No genetic subdivision could be detected in the Northern Andean region, in contrast with what was found for β-tubulin and Ras. At the amino acid level interesting patterns emerged. Previous studies reported only three polymorphic positions in a sampling of isolates from S. tuberosum from different locations worldwide. Two alleles were characterized based on those amino acid positions, C19 K80 I103 and S19 E80 M103, with avirulent and virulent phenotypes, respectively [28]. Here we report four new allelic variants. Three were only present in the isolates from the Southwestern Andes. The natural selection analysis at the amino acid level showed that three out of the nine polymorphic positions were under diversifying selection and two of these were located in the C-terminus region, as previously reported [28]. According to the model of interaction of the virulence genes with the host cell, at least two mechanisms could be affected by amino acid substitutions. The first one is the direct or indirect recognition by the resistance protein from the host. Bos et al. [50] have shown that some mutations leading to specific amino acid substitutions at the position E80K of this protein can be associated with a loss of recognition by the R3a gene product. On the other hand, the replacement of lysine by arginine at the same position of the protein does not affect the cell-death suppression activity of AVR3a [50].
It has been recognized that the variant AVR3aEM produces significantly less hypersensitive response than the other known variants of the gene, producing a virulent phenotype [28], while the AVR3aKI variant is the most effective in suppressing cell-death in the host [51]. This scenario leads to the possible advantage of maintaining polymorphic residues, which may give the pathogen population different adaptive pathways. The remaining polymorphic positions detected in this study were not under strong positive selection. However, their effect on avirulence on S. tuberosum R3-expressing plants remains to be determined in order to define their potential effect on host resistance at the subspecific level.
Polymorphisms of the Avr3a gene may be related to the plant species from which the isolates were collected. Armstrong et al. [28] reported that 55 isolates from S. tuberosum from different locations in the world contained only two alleles. In this study we found both previously reported haplotypes: AVR3aEM (H1) in S. tuberosum as well as in other Solanum species, and AVR3aKI (H3) restricted to S. muricatum. The other four reported variants were not present in S. tuberosum. Because of this, the correlation between each of these variants and host range needs to be established. Additionally, the taxonomic status of the P. infestans-related species P. andina should be clarified in order to understand the real sympatry of the two species and their respective contribution of alleles to the population. Furthermore, more intensive sampling is needed in order to correlate polymorphisms with the host. Finally, the wider picture of the population structure in this region suggests that the small number of steps in the networks, together with the low genetic diversity of this pathogen and the evidence of selection in the effector gene, might be consistent with a scenario of random dispersion of new mutations by drift that in some cases might be fixed by selection. Further research in other populations and more data from effector genes and neutral loci are needed to understand the Phytophthora infestans diversification process in South America.