No genetic erosion after five generations for Impatiens glandulifera populations across the invaded range in Europe

Background The observation that many alien species become invasive despite low genetic diversity has long been considered the ‘genetic paradox’ in invasion biology. This paradox is often resolved through the temporal buildup of genetic diversity through multiple introduction events. These temporal dynamics in genetic diversity are especially important for annual invasive plants that lack a persistent seed bank, for which population persistence is strongly dependent on consecutive seed ‘re-establishment’ in each growing season. Theory predicts that the number of seeds during re-establishment, and the levels of among-population gene flow can strongly affect recolonization dynamics, resulting in either an erosion or build-up of population genetic diversity through time. This study focuses on temporal changes in the population genetic structure of the annual invasive plant Impatiens glandulifera across Europe. We resampled 13 populations in 6 regions along a 1600 km long latitudinal gradient from northern France to central Norway after 5 years, and assessed population genetic diversity with 9 microsatellite markers. Results Our study suggests sufficiently high numbers of genetically diverse founders during population re-establishment, which prevent the erosion of local genetic diversity. We furthermore observe that I. glandulifera experiences significant among-population gene flow, gradually resulting in higher genetic diversity and lower overall genetic differentiation through time. Nonetheless, moderate founder effects concerning population genetic composition (allele frequencies) were evident, especially for smaller populations. Despite the initially low genetic diversity, this species seems to be successful at persisting across its invaded range, and will likely continue to build up higher genetic diversity at the local scale.
Electronic supplementary material The online version of this article (10.1186/s12863-019-0721-4) contains supplementary material, which is available to authorized users.


Background
The number of invasive alien species continues to increase across the globe [1,2]. Consequently, much research has focused on understanding the population genetic processes underlying the successful establishment and spread of invasive alien species outside of their native range [3][4][5]. This work has clearly shown that during the invasion process, many alien species obtain relatively low levels of genetic diversity [3,6,7]. This low genetic diversity is directly caused by the often small number of initial colonists, thus introducing only a small subset of the genetic diversity present in the native range [5,8]. These genetically poor and small initial populations are further subjected to strong genetic drift or founder effects during the early stages of the invasion process, which may further erode genetic diversity and hamper the invasion success on longer timescales [3,5,9].
Range expansion of the invasive species into new areas can result in additional sequential founder effects, bottlenecks and increased genetic drift, further eroding genetic diversity [4,10]. Several invasive species furthermore seem to show boom-bust dynamics, i.e. the rise of populations to outbreak levels, followed by a dramatic decline, suggesting that these potentially genetically poor populations can crash when certain local selective pressures shift [11,12]. The observation, however, that many alien species become invasive, despite their expected bottlenecked populations, low genetic diversity and low evolutionary potential, has long been considered the 'genetic paradox' in invasion biology [3,13]. Several chronosequence-based studies have, however, shown that the genetic paradox is often 'resolved' through the buildup of higher genetic diversity following multiple introduction events from the native range [6,10,13,14]. This clearly illustrates how temporal dynamics affect genetic diversity patterns of invasive alien species after initial invasion, which can, in turn, determine the long-term success of these species in their invaded range [3]. Furthermore, since not only the invasive species' fitness and population persistence, but also its long-term ecosystem impacts and success of potential eradication or control measures are dependent on its population genetic diversity, it is important to understand how these temporal dynamics will affect its population genetic structure [3,15,16]. Indeed, if eradication methods can optimize reduction in genetic diversity, the species' persistence and spread may be minimized due to increased genetic drift effects [5]. Nevertheless, we know very little about the temporal dynamics of population genetic diversity of invasive species, and repeated sampling of the same populations has rarely been done ([16], however see [17][18][19][20]).
These temporal dynamics in genetic diversity are particularly important for annual alien invasive (plant) species, where long-term population persistence is strongly dependent on successful seed establishment in each growing season. Depending on the seedling recruitment success (colonists) and the efficiency of long-distance seed dispersal (i.e. an influx of migrants), these re-establishment dynamics could result in sequential genetic founder events and genetic bottlenecks [21,22]. Indeed, theory predicts that if re-establishment is effectuated by a limited number of individuals (seeds), and only limited among-population gene flow occurs ('propagule pool' colonization model sensu [23]), these recolonization dynamics will result in erosion of population genetic diversity and inflation of among-population genetic differentiation through time [9,17,24]. Alternatively, if sufficiently high seedling recruitment and high (long-distance) gene flow occur during population re-establishment, these consecutive founder events might retain reasonably high levels of genetic diversity ('migrant pool' colonization model sensu [23]). This could even result in a gradual increase in genetic diversity and decrease in across-population genetic differentiation, potentially resulting in stabilization of local population sizes and increase of the overall invasion success of the annual species across its invaded range [17].
Here we focus on the annual invasive alien plant Impatiens glandulifera Royle (Balsaminaceae) (2n = 18 or 20). This species was originally introduced to Europe in the 1800s as an ornamental plant from the western Himalayas [25] and subsequently colonized riparian habitats across its invaded range from southern Spain (37°N) to northern Norway (70°N) [25,26]. The species is highly competitive and can affect several ecosystem functions, such as nutrient cycling and soil erosion control [27,28]. Although I. glandulifera can form large populations, the species has strongly fluctuating annual population sizes [17,29]. These temporal fluctuations in population size and population persistence are mainly caused by the species' annual lifecycle and the absence of a persistent seed bank [25]. Previous research has observed local adaptation of several life-history traits in this species [30], suggesting sufficiently high genetic diversity (however see [31]). This anticipated high genetic diversity is further supported by the expectation of substantial gene flow within and across populations through both hydrochorous dispersed seeds [32,33] and pollen [34]. Other studies have nevertheless observed genetically impoverished I. glandulifera populations across several parts of its invaded range [7,35]. Similarly, a recent study showed relatively high genetic differentiation of I. glandulifera populations both within and across river catchments in the UK, suggesting founder/drift dynamics due to sequential population re-establishment under limited gene flow [17]. These contradicting genetic results suggest that the temporal genetic dynamics are complex in this species. However, these studies did not evaluate temporal dynamics in population genetic diversity.
In this study, we resampled 13 I. glandulifera populations, ten of which were studied in [7], to assess changes in the neutral genetic diversity 5 years after the initial sampling. These populations are distributed across six study regions along a 1600 km long latitudinal gradient in Europe, ranging from Amiens (France) in the south to Trondheim (Norway) in the north. We expect that, if these populations have experienced strong sequential founder effects in the 5 years between sampling years due to increased genetic drift and low gene flow levels, population-level genetic diversity will have decreased, and among-population genetic differentiation will have increased. Alternatively, if gene flow remained sufficiently high and sequential population re-establishment was effectuated through genetically diverse founders, we expect population genetic diversity to have remained constant or even increased, and among-population genetic differentiation to have remained constant or even decreased. We furthermore expect these potential population genetic changes to be dependent on population size, with much stronger potential shifts in genetic diversity and genetic differentiation in small populations.

Genetic diversity
Population sizes have decreased for almost all populations between the 2011 and 2016 sampling years (

Genetic bottlenecks and effective population size
We found evidence for a small heterozygosity excess, indicating recent bottlenecks in population Bremen 2 (Wilcoxon p = 0.047) and Amiens 1 (Wilcoxon p = 0.031) for 2011. For the 2016 data, however, no significant recent bottlenecks were detected. As expected considering the low overall genetic diversity of the species, the assessed effective population sizes were relatively small, and much smaller than the actual population sizes (average: 39.8, range: 4.2-162.2) (Table 1).

Genetic differentiation
Genetic differentiation was significant among all pairwise populations for each sampling year, except between population Ghent 1 and population Ghent 2 (see Additional files 1, 2, 3 and 4). Genetic differentiation was furthermore significant between both sampling years for The analysis of molecular variance (AMOVA) indicated that molecular variance was significantly partitioned across all tested hierarchic levels for both sampling years (p < 0.001) ( Table 3). Although the percentage of molecular variance among study regions remained constant (26%), the percentages declined among populations (21 to 19%) and among individuals (16 to 12%), with a proportional increase in the percentage of molecular variance within individuals (37 to 43%) ( Table 3).
Results of the principal coordinates analysis showed relatively large changes in genetic makeup for several populations between the 2011 and 2016 sampling years (Fig. 3). The repeated measures ANOVAs showed that part of this change was related to initial population size. More specifically, changes on PCoA axis 2 between both sampling years were strongest for small populations (

Discussion
The overall constant genetic diversity in both sampling years, slightly increased polymorphism and reduced among-population genetic differentiation strongly suggest that population re-establishment is effectuated by relatively high numbers of colonists and migrants, thus stabilizing both genetic diversity and among-population genetic differentiation across generations. These results are in agreement with observed temporal genetic patterns in two short-lived invasive alien insects after tens of generations [18,19]. In other words, despite the initially low genetic diversity and associated low effective population sizes, I. glandulifera seems to be successful at persisting across its invaded range, apparently owing to surprisingly high levels of gene flow. Our results thus support the theoretical 'migrant pool' colonization model, rather than the 'propagule pool' colonization model of annual population re-establishment between 2011 and 2016 [23,24].
Although no evidence was found for genetic bottlenecks in 2016 and slight increases in genetic diversity were evident, substantial shifts in the genetic makeup did occur for most populations. This is in agreement with the observed temporal genetic differentiation for two I. glandulifera populations after two generations in the UK [17]. These results suggest that, although relatively stable in genetic diversity, these populations do experience significant founder effects and potential drift in their genetic makeup (allele frequencies) during sequential population re-establishment [17].
As previously observed, overall population genetic diversity was low across the invaded European range of I. glandulifera in 2011 [7], which is in line with the often observed low genetic diversity and heterozygosity for invasive alien species in their invaded range [3,6,7]. We furthermore detected significant inbreeding coefficients for seven of the 13 studied populations in 2011, contradictory to the results of Hagenblad et al. [7]. Note that these differences in F IS between both studies are likely caused by the different number of studied individuals  [7]) and the different number of studied populations (13 vs. 10). However, high inbreeding coefficients (> 0.20) have previously been observed for this species in different regions of both its invaded and native range [17,35]. In our study, high F IS values are likely caused by selfing and biparental inbreeding, as a direct consequence of the overall low genetic diversity. However, a Wahlund effect might also be partly responsible for the high F IS values, if subpopulation structure arises due to population re-establishment of a mixture of natural founders and recent colonists [17,36]. Interestingly, the retention of genetic diversity during sequential population re-establishment is not solely driven by high local seed colonization. Indeed, the significant decrease in overall genetic differentiation, decreased among-population molecular variance (F SR ) and increased polymorphism/within-individual molecular variance (F IS ) all suggests that gene flow among (local) populations is significantly shaping genetic patterns of these populations. This illustrates how I. glandulifera's potential for within-study region, long-distance (mainly hydrochorous) seed dispersal can contribute to this species' temporal genetic patterns.
Not surprisingly considering the geographical distances, no indication for among-region gene flow was observed. The among-region molecular variance (F RT ) has remained constant at the relatively high 26% level between both sampling years, also observed in the relatively high, average (among-region) genetic differentiation levels (F ST = 0.218). This furthermore helps to explain why local (within-region) population genetic diversity has remained so low, despite the presence of among-population gene flow and strong differences in allele frequencies and identities among study regions [7]. We can, however, expect that this pattern is temporary. Indeed, over longer time scales, long-distance gene flow will very likely result in the mixing of genetic material of the different areas along the invaded range, thus gradually increasing overall genetic diversity and potentially fitness of local populations [13]. This scenario is equivalent to the sequential introduction events with subsequent gene pool mixing that has been observed for several invasive alien species [6,10,14]. Also note that the occurrence of populations with high genetic diversity in different parts of I. glandulifera's invaded range, such as Finland [35] and Lithuania [37], might be partly caused by such gene flow and subsequent gene pool mixing events. Especially the mixing of the genetically very dissimilar Stockholm populations [7], with the more southern populations could result in strong increases in local population genetic diversity.
Despite the retention of genetic diversity at the population level, all but one of the resampled populations decreased in size between 2011 and 2016. Although part of this decline might be due to local eradication actions, this reduction could also reflect temporary population size fluctuations, incidentally due to suboptimal weather  To really assess if there is a consistent temporal trend toward population size reduction across the gradient, population sizes should be assessed consecutively in the following years. Initial population sizes affected several patterns, although effects were mainly caused by one outlier (the single small population 'Trondheim 2' in our dataset). Our results showed that initially small populations were characterized by the largest shifts, both in genetic composition and in F IS , possibly caused by the combined actions of increased genetic drift and higher chances of founder effects in small populations. This likely illustrates the importance of large population sizes and associated high seed production to overcome deleterious effects of sequential population re-establishment in this species [23,38]. These effects of population size only seem to become important below a certain population size threshold however, since the reduction in population size between 2011 and 2016 for most populations did not affect any of the genetic patterns. Alternatively, population size effects might be partly masked by potential substantial annual population size fluctuations during the last 5 years. Indeed, this species is known to occasionally exhibit large fluctuations in annual population sizes [17,29]. Additionally, although no persistent seed bank exists, research has shown that, at least some seeds can persist up to 2 years in the soil [25]. Consequently, germination of these older seeds during population re-establishment, can likely moderately buffer the deleterious genetic effects of large population size fluctuations. 
Evaluation of temporal genetic patterns for additional, initially small populations could assess the validity of our current results regarding the importance of population size. Effective population sizes were nonetheless extremely small for all populations, suggesting that population re-establishment is likely occurring through many seeds originating from only a limited number of (genetically) different plant individuals, which is not surprising considering the high fecundity of most I. glandulifera plants [25].

Conclusions
In sum, we observed a small temporal increase in genetic diversity and decrease in among-population genetic differentiation between 2011 and 2016, for several I. glandulifera populations across Europe, despite a seemingly overall decrease in their population sizes. These results suggest that annual population re-establishment is following the 'migrant pool' colonization model [23], thus preventing the erosion of local genetic diversity and inflation of among-population genetic differentiation through the combined action of genetic bottlenecks and drift [9,24]. Our results do nonetheless suggest moderate founder effects concerning population genetic composition (allele frequencies), especially for smaller populations, which is in agreement with the results of Walker et al. [17].
Our study furthermore suggests that I. glandulifera experiences significant among-population gene flow, gradually resulting in higher genetic diversity and lower overall genetic differentiation. Despite the initially low genetic diversity and associated low effective population sizes, this species seems to be successful at persisting across its invaded range, and will likely continue to build up higher genetic diversity at the local scale, potentially further enhancing its success. These results suggest that it is very unlikely that this species will show boom-bust dynamics in the longer run, despite its tendency for strong population size fluctuations [29]. In other words, if the species is to be removed from its invaded range, this will have to be effectuated through active eradication measures, since genetically-driven local extinctions are unlikely. The results furthermore suggest that long-term fitness and adaptive potential of this species will likely continue to rise across the invasive range, due to slow but gradual increase in local genetic diversity. This could result in more pronounced population persistence and potential expansion of its current invaded range across Europe.
Fig. 3 (caption fragment): The first three PCoA axes explained 35.29, 17.16 and 13.81% of the total variation, respectively.

Sampling and laboratory procedures
In 2011, six Impatiens glandulifera populations with at least 30 flowering individuals were selected for each of six study regions along a 1600 km latitudinal European gradient, ranging from Amiens (France) in the south to Trondheim (Norway) in the north (for more information see: [7,30]) (Table 1). Each population was defined as a single continuous patch of I. glandulifera individuals. Populations were mainly located in wet areas, in forests or on forest edges, often in the vicinity of waterways. Within each municipality, all six populations were sampled with a minimum distance of 1.8 km between populations. Leaf material was collected from 30 random individuals for each population in 2011 and stored after 24 h drying at 45°C. All populations were revisited in 2016 with the help of GPS coordinates, and new leaf samples were collected from 30 random individuals and dried using silica gel. In line with the 2011 sampling campaign, 2016 sampling was performed according to national legislation [7]. Population size was assessed during both sampling years using six ordinal levels: 1. < 50 individuals; 2. 50-100 ind.; 3. 100-200 ind.; 4. 200-500 ind.; 5. 500-1000 ind.; 6. > 1000 ind. More specifically, we counted up to 100-200 individual plants, and subsequently used the patch size of this counted part to visually assess the approximate number of individuals of the whole population. The change in population size between both sampling years was calculated as the difference between the 2011 and 2016 population size ordinal levels. Note that sampling and population size estimations for 2016 were performed in collaboration with the original collectors, using the detailed written protocols from the 2011 sampling campaign.
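The ordinal size scoring described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, the treatment of shared class boundaries (100, 200 and 500 assigned to the higher level) is an assumption, and so is the sign convention of the change (2016 level minus 2011 level, so that negative values indicate a decline).

```python
def size_level(n_individuals):
    """Map an estimated number of individuals to the six ordinal levels.

    Assumption: shared boundary values (100, 200, 500) fall in the higher
    level; 50 is level 2 and 1000 is level 5, as the class labels imply.
    """
    if n_individuals < 50:
        return 1
    if n_individuals < 100:
        return 2
    if n_individuals < 200:
        return 3
    if n_individuals < 500:
        return 4
    if n_individuals <= 1000:
        return 5
    return 6


def size_change(level_2011, level_2016):
    """Change in ordinal level; assumed sign convention: 2016 minus 2011."""
    return level_2016 - level_2011


# Hypothetical population: about 800 plants in 2011, about 150 in 2016.
change = size_change(size_level(800), size_level(150))
```

Because the levels are ordinal and unevenly spaced, a change of one level does not correspond to a fixed number of plants; the value is only meaningful as a rank-based covariate.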
We used the I. glandulifera individuals of the ten European populations that had been microsatellite genotyped in the study of Hagenblad et al. [7], named  (Table 1). We genotyped an additional three 2011 populations (Ghent 2, Bremen 2 and Stockholm 2, Table 1), using stored dried leaf samples, resulting in a total of two genotyped populations for each study region, except the Trondheim region, where three populations were genotyped (Table 1). The same 13 populations were genotyped for the 2016 samples. Due to logistical constraints, only 23 randomly selected individuals of the collected 30 were genotyped for each population. Consequently, 23 individuals were also randomly selected for each population from the original Hagenblad et al. [7] dataset. This setup resulted in a total of 598 genotyped individuals across 13 populations and two time-points (2011 and 2016).
We used E.Z.N.A HP plant DNA mini kits for leaf DNA extraction (Omega Bio-tek Inc., GA, USA). We amplified nine microsatellites previously used for the I. glandulifera samples from 2011 [7]. Six of these microsatellites were developed by Provan et al. [39]  Initial comparison of the allele identities and frequencies of the newly genotyped individuals, with those of the genotypes obtained by Hagenblad et al. [7], suggested a consistent allele shift between the datasets, potentially due to the use of a different size standard (GeneScan 500 LIZ vs. GeneScan 600 LIZ respectively) [40]. Ten individuals of the Hagenblad et al. [7] dataset, selected to contain 85% of the observed alleles, were subsequently reanalyzed with the described PCR protocol using the original DNA extracts. This data indeed showed a consistent allele shift of two base pairs across all tested alleles, and was subsequently used to calibrate all genotype data to the original Hagenblad et al. [7] standardized allele identities [40]. Twenty individuals of the 2016 sampling year were furthermore genotyped twice, with an overall reproducibility of 98% of the genotypic allele patterns.
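The calibration step can be illustrated with a short Python sketch. It assumes that the re-genotyped reference individuals provide paired allele sizes (original call, new call) and that the offset is the same for every allele, as reported above; the function names and the example sizes are hypothetical, and the direction of the shift is estimated from the data rather than taken from the text.

```python
from collections import Counter


def estimate_shift(paired_calls):
    """Estimate the size offset (new minus original, in bp).

    paired_calls: list of (original_size, new_size) tuples for the same
    individual, locus and allele. Returns the most common offset and a flag
    telling whether every pair shows that same offset.
    """
    offsets = Counter(new - old for old, new in paired_calls)
    shift, count = offsets.most_common(1)[0]
    return shift, count == len(paired_calls)


def calibrate(genotypes, shift):
    """Shift new allele sizes back onto the original allele identities.

    genotypes: list of (allele1, allele2) tuples; None marks a missing call.
    """
    return [tuple(None if a is None else a - shift for a in g)
            for g in genotypes]


# Hypothetical reference calls showing a uniform 2 bp offset.
reference = [(180, 182), (184, 186), (200, 202)]
shift, is_consistent = estimate_shift(reference)
calibrated = calibrate([(182, 186), (None, None)], shift)
```

Checking the consistency flag before applying the correction matters: a locus-specific or allele-specific offset would call for a per-locus lookup rather than a single subtraction.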

Data analysis

Genetic diversity
We used Micro-Checker to assess potential problems with scoring errors due to null alleles, stutter bands or large allele dropout [41]. Although no stutter bands and large allele dropouts were observed, Micro-Checker did indicate a homozygote excess (potential null alleles) for six different loci in at least one, and up to eight, of the 13 populations in at least one of the two time-points. However, considering the overall low genetic diversity, we believe that these patterns are likely not caused by null alleles, except for markers IGNSSR101 and A2, which each failed to amplify for all individuals of at least one population (populations Trondheim 1 and Trondheim 2, respectively). This is further supported by the observation of high estimated null allele frequencies (> 30%) for IGNSSR101 and A2, but not the other markers, following the Expectation Maximization algorithm for null allele frequency estimation [42] with the FreeNA software [43]. Both markers were nonetheless included for the genetic diversity measures, since exclusion resulted in comparable results for genetic diversity (results not shown).
We calculated the mean number of alleles (further referred to as "A"), observed heterozygosity (H O ), expected heterozygosity (H E ) and polymorphism (%P, the percentage of polymorphic loci across all loci) for each population using GenAlEx 6.503 [44]. The inbreeding coefficient (F IS ) was estimated for all populations based on Weir & Cockerham's F-statistics [45] and significance levels were inferred using 9999 permutations of alleles among individuals within populations with FSTAT 2.9.3 [46].
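As an illustration of what these summary statistics measure, the following Python sketch computes A, H O, H E and %P for one population from diploid genotypes. It is not a reimplementation of GenAlEx or FSTAT: the data layout and function names are hypothetical, loci without any typed individual are skipped (an assumption), and the inbreeding coefficient is the simple fixation index 1 - H O / H E rather than Weir & Cockerham's estimator.

```python
from collections import Counter


def locus_stats(genotypes):
    """Per-locus statistics for one population.

    genotypes: list of (allele1, allele2) tuples; None marks a missing
    genotype. Returns None when no individual was typed at this locus.
    """
    typed = [g for g in genotypes if g is not None]
    n = len(typed)
    if n == 0:
        return None
    counts = Counter(allele for g in typed for allele in g)
    freqs = [c / (2 * n) for c in counts.values()]
    return {
        "A": len(counts),                            # number of alleles
        "Ho": sum(a != b for a, b in typed) / n,     # observed heterozygosity
        "He": 1.0 - sum(p * p for p in freqs),       # expected heterozygosity
    }


def population_summary(loci):
    """Average the per-locus statistics over all typed loci.

    loci: dict mapping locus name -> list of genotypes.
    """
    stats = [s for s in map(locus_stats, loci.values()) if s is not None]
    k = len(stats)
    mean_a = sum(s["A"] for s in stats) / k
    mean_ho = sum(s["Ho"] for s in stats) / k
    mean_he = sum(s["He"] for s in stats) / k
    pct_poly = 100.0 * sum(s["A"] > 1 for s in stats) / k
    # Simple fixation index; NOT the Weir & Cockerham estimator.
    f_simple = 1.0 - mean_ho / mean_he if mean_he > 0 else float("nan")
    return {"A": mean_a, "Ho": mean_ho, "He": mean_he,
            "P": pct_poly, "F": f_simple}


# Hypothetical population with one polymorphic and one monomorphic locus.
example = {
    "locus1": [(1, 2), (1, 1), (2, 2), (1, 2)],
    "locus2": [(5, 5), (5, 5), (5, 5), (5, 5)],
}
summary = population_summary(example)
```

In this toy example half of the loci are polymorphic and observed heterozygosity equals the expected value, so the simple fixation index is zero; a homozygote excess would push it above zero.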
To test for changes in genetic diversity between the two sampling years we used repeated measures ANOVA (Pillai's Trace test) with sampling year as repeated measure factor and 2011 population size, change in population size between 2011 and 2016 and their interaction as covariates using A, H O , H E , F IS and %P as dependent variables (SPSS Statistics 21.0). Final models were obtained using backward model selection on the covariates based on p-values.

Genetic bottlenecks and effective population size
We used the Bottleneck software to test for potential recent bottleneck events in each population at both sampling years, using the two-phase model of mutation (TPM) with a 90% stepwise component [47]. This technique tests for bottleneck events by looking for evidence of excess heterozygosity relative to allele numbers [47]. Effective population size (N e ) was assessed using the temporal method, for which the genetic composition of each population across the two sampling years (five generations) is used to estimate N e . More specifically we used the method of Jorde and Ryman [48], which is considered more appropriate for small sample sizes and skewed allele frequencies compared to more classical methods which often overestimate N e . N e estimations were furthermore based on plan I sampling (non-destructive sampling), with a 0.02 critical value (frequency) for rare allele exclusion, using the NeEstimator v2 software [49].
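The logic of the temporal method can be sketched as follows. Note that this sketch deliberately swaps in the classical estimator (Nei and Tajima's standardized variance in allele frequency change, F c, with the usual sampling correction) because its formula is compact; it is not the Jorde and Ryman estimator used in this study, it omits the census-size term that applies under plan I sampling, and the function names and example frequencies are hypothetical.

```python
def fc_locus(x, y):
    """Standardized variance in allele frequency change at one locus.

    x, y: dicts mapping allele -> frequency at the first and second sample.
    Fc = (1/K) * sum_i (x_i - y_i)^2 / ((x_i + y_i)/2 - x_i * y_i)
    """
    alleles = set(x) | set(y)
    total = 0.0
    for allele in alleles:
        xi, yi = x.get(allele, 0.0), y.get(allele, 0.0)
        denom = (xi + yi) / 2 - xi * yi
        if denom > 0:  # denom is 0 only when the allele is fixed in both samples
            total += (xi - yi) ** 2 / denom
    return total / len(alleles)


def temporal_ne(loci, generations, n_first, n_second):
    """Classical temporal estimate of effective population size.

    loci: list of (x, y) frequency-dict pairs, one pair per locus.
    n_first, n_second: number of diploid individuals in each sample.
    The sampling noise 1/(2*S0) + 1/(2*St) is subtracted from the mean Fc;
    the plan I census-size term is omitted in this sketch.
    """
    f_mean = sum(fc_locus(x, y) for x, y in loci) / len(loci)
    drift = f_mean - 1 / (2 * n_first) - 1 / (2 * n_second)
    if drift <= 0:
        return float("inf")  # change is no larger than expected from sampling
    return generations / (2 * drift)


# Hypothetical locus: allele A moves from 0.5 to 0.7 over five generations,
# with 23 individuals sampled at each time point.
locus = ({"A": 0.5, "B": 0.5}, {"A": 0.7, "B": 0.3})
ne = temporal_ne([locus], generations=5, n_first=23, n_second=23)
```

The larger the allele frequency change that remains after subtracting sampling noise, the smaller the estimated N e; when the observed change is no larger than sampling noise alone, the estimate is unbounded.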

Genetic differentiation
We calculated pairwise genetic differentiation among populations for both sampling years separately, based on Wright's F-statistics (F ST ). Additionally, genetic differentiation was assessed between the two sampling years for each population. We additionally calculated Hedrick's G' ST and Jost's D as measures of genetic differentiation, since, unlike F ST , these measures are not affected by marker variability [50]. G' ST is the original G ST as defined by Nei [51] standardized by its maximum value [52]. Jost's D is based on the effective number of alleles rather than on heterozygosity [53]. We used GenAlEx 6.503 for calculation and significance testing (9999 permutations) of all pairwise genetic differentiation metrics [44]. Since two microsatellite markers (IGNSSR101 & A2) failed to amplify for one population, both were excluded for the calculation of all genetic differentiation measures. Additionally, we calculated pairwise F ST values corrected for null-alleles using the ENA (excluding null alleles) correction method with the FreeNA software [43].
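The difference between these measures can be made concrete with a small single-locus Python sketch. It uses Nei's G ST as the F ST analogue and the textbook definitions of Hedrick's G' ST and Jost's D, without the sample-size corrections applied by software such as GenAlEx; the function name and the example frequencies are hypothetical.

```python
def differentiation(pop_freqs):
    """Single-locus differentiation among k equally weighted populations.

    pop_freqs: list of dicts mapping allele -> frequency, one per population.
    Returns (G_ST, G'_ST, Jost's D), uncorrected for sample size.
    """
    k = len(pop_freqs)
    alleles = set().union(*pop_freqs)
    # Mean within-population expected heterozygosity.
    h_s = sum(1.0 - sum(p * p for p in f.values()) for f in pop_freqs) / k
    # Expected heterozygosity of the pooled (mean) allele frequencies.
    mean_p = [sum(f.get(a, 0.0) for f in pop_freqs) / k for a in alleles]
    h_t = 1.0 - sum(p * p for p in mean_p)
    g_st = (h_t - h_s) / h_t if h_t > 0 else 0.0
    # Hedrick: G_ST divided by its maximum given H_S and k.
    g_st_prime = g_st * (k - 1 + h_s) / ((k - 1) * (1.0 - h_s))
    # Jost: based on the effective number of alleles.
    jost_d = ((h_t - h_s) / (1.0 - h_s)) * (k / (k - 1))
    return g_st, g_st_prime, jost_d


# Hypothetical case: two populations that share no alleles at all.
pops = [{"A": 0.5, "B": 0.5}, {"C": 0.5, "D": 0.5}]
g_st, g_st_prime, jost_d = differentiation(pops)
```

In this example the two populations are completely differentiated, yet G ST is only one third because within-population heterozygosity caps its maximum, whereas G' ST and Jost's D both reach one. This is the dependence on marker variability that motivates reporting the additional measures.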
We used paired t-tests to compare pairwise among-population genetic differentiation (F ST , G' ST , Jost's D and null-allele corrected F ST ) between the 2011 and 2016 populations. Significance of these paired t-tests was assessed based on 9999 bootstraps, to overcome issues with the pairwise dependency of the data (SPSS Statistics 21.0). We performed a hierarchical analysis of molecular variance (AMOVA) on pairwise F ST values (9999 permutations) with GenAlEx 6.503 [44], for each sampling year (2011 and 2016) separately. AMOVA partitioned the total genetic diversity among the six study regions (among-regions), among populations within regions and among individuals within populations.
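A bootstrap comparison of paired values can be sketched as below. This is a simplified stand-in for the SPSS procedure, not a reproduction of it: it resamples the paired differences with replacement and reports a percentile confidence interval for the mean difference. The function name, the seed and the example values are hypothetical.

```python
import random


def paired_bootstrap(values_2011, values_2016, n_boot=9999, seed=1):
    """Percentile bootstrap for the mean paired difference (2016 - 2011).

    Returns the observed mean difference and an approximate 95% interval.
    """
    diffs = [b - a for a, b in zip(values_2011, values_2016)]
    n = len(diffs)
    observed = sum(diffs) / n
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    means = sorted(sum(rng.choices(diffs, k=n)) / n for _ in range(n_boot))
    lower = means[int(0.025 * n_boot)]
    upper = means[int(0.975 * n_boot)]
    return observed, (lower, upper)


# Hypothetical pairwise differentiation values for six population pairs.
fst_2011 = [0.30, 0.25, 0.28, 0.22, 0.35, 0.27]
fst_2016 = [0.26, 0.22, 0.25, 0.20, 0.30, 0.24]
mean_diff, (ci_low, ci_high) = paired_bootstrap(fst_2011, fst_2016)
```

An interval that lies entirely below zero would indicate lower differentiation in 2016. Resampling pairs in this way does not by itself remove the dependence that arises because each population contributes to many pairwise values, so the result should be read as illustrative.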
Genetic differentiation between populations was furthermore visualized using a covariance-based principal coordinates analysis (PCoA) based on the standardized F ST -matrix. To test for systematic changes in population-level genetic composition between the two sampling years we used the previously described repeated measures ANOVA design with the plot location on each of the first three PCoA axes as dependent variables (SPSS Statistics 21.0). In these models, sampling year was included as a repeated measure factor and 2011 population size and change in population size between 2011 and 2016 as covariates. A similar PCoA and subsequent repeated measures ANOVA was subsequently performed on the null-allele corrected F ST values.