Genetic diversity of natural orchardgrass (Dactylis glomerata L.) populations in three regions in Europe

Background Dactylis glomerata (orchardgrass or cocksfoot) is a forage crop of agronomic importance comprising high phenotypic plasticity and variability. Although the genus Dactylis has been studied quite well within the past century, little is known about the genetic diversity and population patterns of natural populations from geographically distinct grassland regions in Europe. The objectives of this study were to test the ploidy level of 59 natural and semi-natural populations of D. glomerata, to investigate genetic diversity, differentiation patterns within and among the three geographic regions, and to evaluate selected populations for their value as genetic resources. Results Among 1861 plants from 20 Swiss, 20 Bulgarian and 19 Norwegian populations of D. glomerata, exclusively tetraploid individuals were identified based on 29 SSR markers. The average expected heterozygosity (HE,C) ranged from 0.44 to 0.59 and was highest in the Norwegian region. The total number of rare alleles was high, accounting for 59.9% of the amplified alleles. 80.82% of the investigated individuals could be assigned to their respective geographic region based on allele frequencies. Average genetic distances were low despite large geographic distances and ranged from D = 0.09 to 0.29 among populations. Conclusions All three case study regions revealed high genetic variability of tetraploid D. glomerata within selected populations and numerous rare and localized alleles which were geographically unique. The large, permanent grassland patches in Bulgaria provided a high genetic diversity, while fragmented, semi-natural grassland in the Norwegian region provided a high amount of rare, localized alleles, which have to be considered in conservation and breeding strategies. Therefore, the selected grassland populations investigated conserve a large pool of genetic resources and provide valuable sources for forage crop breeding programs.


Background
Dactylis glomerata L. (orchardgrass or cocksfoot), a long-lived perennial grassland species, is the fourth most important forage grass in the world [1]. Its economic value is based on its high productivity and its disease resistance under varying climatic conditions [2]. Due to its high forage quality (i.e. sugar and protein contents), shade tolerance and persistence, D. glomerata is used for hay or silage production and grazing worldwide. Continuous outcrossing by wind-pollination, natural selection and adaptation processes have resulted in a wide geographic range and large morphological variability [3]. D. glomerata has a genome size of 4312 Mbp and comprises diploid (2n = 2× = 14), tetraploid (2n = 4× = 28) and hexaploid (2n = 6× = 42) accessions [1]. Polyploidy in this complex is known to result from autopolyploidy due to polysomic inheritance [4], which can reduce the loss of genetic variation within populations [5]. Within natural populations and among the more than 200 cultivars currently available, tetraploid D. glomerata are the most widespread [1]. However, diploid and tetraploid populations have been recorded living in sympatry, e.g. on the Iberian peninsula [6]. Sympatric occurrence can result from habitat changes leading to the intermixing of diploid and tetraploid populations, from the formation of autotetraploids or from hybridization among individuals of different ploidy levels [7]. Compared to tetraploid populations, hexaploid populations of D. glomerata are rare and restricted to certain areas, e.g. Libya, Egypt or Spain [8]. Polyploid populations are of major importance in nature. They have evolutionary benefits due to their increased heterozygosity and decreased inbreeding depression. They colonize new niches more easily and are capable of coping with changing ecological conditions across a broad geographical range [9].
Tetraploid individuals are also characterized by a great genetic variability and an increased cell, ligule and plant size [10].
The high plasticity and heterogeneity of the genome of D. glomerata has led to a widespread occurrence in natural and semi-natural grassland across Europe. Natural populations of D. glomerata are of major importance for forage crop breeding. In natural and semi-natural grasslands, those populations harbor high genetic diversity, which provides advantages for future breeding and conservation programs in particular with respect to climatic changes and an increasing demand for forage and food production [11]. Detailed information on genetic diversity of natural populations of D. glomerata, which could be sources for genetically diverse material, is rather scarce. Investigation of genetic patterns in natural or semi-natural grassland populations may not only reveal fundamental knowledge on population genetic structures, but may also support the evaluation and utilization of natural resources with respect to forage crop improvement and in situ conservation. Our recent investigations in Switzerland suggest high genetic diversity within, but low genetic variability among populations in permanent grassland (Last et al., submitted).
Geographically distinct populations can differ in their level of genetic diversity or in the distribution of diversity within and among regions [12]. The value of separated geographic regions for forage crop improvement arises from the limited gene flow among those populations and their independent development under different conditions [13]. These distinct sites are differentiated by various environmental factors such as soil conditions, average temperature or day length and may contain populations harboring valuable traits or alleles that could be used in future breeding programs [14]. This diversity and variability from different geographic regions could be used for in situ protection of forage crop genotypes and populations from genetic erosion and could provide new germplasm for forage crop breeding.
Simple Sequence Repeats (SSRs) are genetic markers consisting of one to six nucleotides occurring in a repeated pattern (tandem repeats). Their high abundance across the genome, neutral and co-dominant inheritance, and highly polymorphic character qualify SSRs as multiallelic genetic markers for a broad range of applications, e.g., in breeding and crop improvement as well as in population and ecological genetics (summarized by Kalia et al. [15]).
The aim of this study was to investigate the population structure and genetic variability of natural and semi-natural D. glomerata populations in Bulgaria, Norway and Switzerland, representing three grassland regions in Europe. The objectives were: (1) to investigate the ploidy level of D. glomerata individuals in three selected grassland regions, (2) to study the genetic diversity and patterns of differentiation within and among populations from different geographical regions, and (3) to evaluate the use of geographically distinct regions for the in situ conservation of genetic resources of the grass species D. glomerata.

Sampling sites and plant material
Sampling sites were located in grassland regions of three European countries. The Bulgarian region (BG) was located in the Smoljan region in the Rhodope Mountains of South Central Bulgaria. At altitudes ranging from 900 to 1400 m a.s.l., the 20 selected sampling sites were distributed across an area of 3193 km², with distances of 0.18 to 47.36 km between sites (Ø = 1.34 km between sampling sites on farm, Ø = 23.59 km between sites on different farms). Management was characterized by low-input farming of permanent grassland for cattle- and sheep-based dairy production. The Swiss region (CH) was located in the canton Obwalden, in the Northern Swiss Alps. Ranging from 600 to 1100 m a.s.l., the 20 selected sampling sites were distributed across an area of 12 km², with distances of 0.09 to 6.03 km between sites (Ø = 0.44 km between sampling sites on farm, Ø = 1.97 km between sites on different farms). Farms were dominated by natural, permanent grassland for cattle-based dairy production (Last et al., submitted). The Norwegian region (NO), located in Nord-Østerdal in the north of Hedmark County, covered an area of 4871 km², ranging from 500 to 1600 m a.s.l. The 19 sampling sites, which had not been re-sown for at least 6 years, were located at distances of 0.06 to 46.69 km from one another (Ø = 4.19 km between sampling sites on farm, Ø = 19.35 km between sites on different farms). Management was characterized by sheep raising and hay production. Seed mixtures applied in the Norwegian sampling sites did not contain D. glomerata cultivars, and the D. glomerata plants occurring there were considered natural populations. On-farm interviews and questionnaires were used to obtain information about farming systems, on-farm production, management and the potential application of commercial seed mixtures.
Fresh leaf tissue of plant tillers was sampled from randomly selected D. glomerata plants from a total of 59 sampling sites (Table 1) during spring and summer 2010. With few exceptions, each population at one sampling site was represented by 32 individuals separated by a distance of at least one meter (Table 1). The collected plant material from each individual was immediately placed in a 15 ml plastic tube half-filled with silica gel, where it was left to dry until DNA extraction.

DNA extraction and SSR analysis
Dried plant tissue (30 mg) was ground three times with metal beads at 30 Hz using a mill (Retsch GmbH, Haan, Germany). DNA was extracted using the NucleoSpin® 96 Plant II extraction kit (Macherey-Nagel, Düren, Germany). Quantity and quality of DNA were assessed by spectrophotometry using a NanoDrop (Thermo Fisher Scientific, Wilmington, DE, USA) and the ND-1000 software. SSR marker analysis was performed in multiplex reactions using 29 primer pairs. All binned peaks were checked for correct assignment to corresponding bands and corrected manually. Samples were randomly arranged for PCR and fragment analysis.

Statistical analysis
The ploidy level of collected individuals was calculated based on the maximum and mean number of alleles per locus, for all samples and across all loci, using the R package "polysat" [18,19]. As used in the study of Aerts et al. [20] and also proposed by Palop-Esteban et al. [21], the R package "polysat" provided a useful statistical tool to handle microsatellite data while considering tetraploidy within populations. The total number of alleles per locus and the polymorphic information content (PIC) were calculated for each primer (Table 2). Genetic diversity of D. glomerata within populations was estimated using the unbiased measurement of average expected heterozygosity corrected for sample size HE,C [22] and allelic richness (A, mean number of alleles per locus) per population (Table 1). HE,C was calculated based on the tetraploid data set using the ATETRA program 1.3a [23]. The total number of rare alleles, defined as alleles with a frequency < 0.05 per locus, was calculated, and rare alleles were classified in allele categories (Table 3) as proposed by Brown [24]. A multiple comparison of diversity indices among regions was conducted using Tukey's HSD (honestly significant difference) test.
Genetic structure and variation among regions, among populations and among individuals were assessed by Analysis of Molecular Variance (AMOVA) using the R package "vegan" [25,26]. AMOVA was based on pairwise Euclidean distance using binary SSR allele data. Genetic differentiation between populations was addressed by the calculation of pairwise genetic distance D [27] based on allele frequencies using the R packages "polysat" and "adegenet" [18,28]. A Mantel test was performed to test for correlation between matrices of pairwise genetic distance (D) and the respective geographic distance (km) in order to test for isolation by distance (IBD), applying the Isolation by Distance Web Service Version 3.23 (IBDWS, http://ibdws.sdsu.edu/, [29]). Significance of correlation was tested by 999 random permutations. Population genetic variation was investigated by principal component analysis (PCA) based on a binary data matrix (present/absent) derived from SSR alleles. Spatial population structure and membership of individuals to populations were investigated based on the binary data set using the model-based clustering method implemented in the STRUCTURE program version 2.3.1 [30]. The optimum number of subpopulations (K) among and within regions was calculated based on six independent runs, each with a burn-in period of 100,000 iterations followed by 100,000 Markov Chain Monte Carlo (MCMC) repetitions, applying the implemented admixture model and correlated frequencies [31]. For the estimation of subpopulations among regions, K was set from 1 to 8. Within regions, K was set from 1 to 20 (1 to 19 in the Norwegian region). The K-value revealing the highest maximum likelihood 'Ln P(D)' after several independent runs was selected for the assignment of individuals to subpopulations based on their membership probability. Populations in which all individuals had a membership probability of ≥ 0.8 were regarded as distinct populations, whereas populations containing individuals with a membership probability < 0.8 were considered as admixed [32]. In order to retain the high number of polymorphic loci and to include rare alleles in the data analysis, we applied the commonly used and more conservative approach based on binary allele scoring for AMOVA, isolation by distance, PCA and the analysis by STRUCTURE. This approach is generally accepted for investigating synthetic and natural populations [20,33-35].
Table 1. The number of individuals (n), geographical coordinates, expected heterozygosity (HE,C), allelic richness (A), the total number of rare alleles and the mean number of rare alleles per locus are given per population. Twenty populations originated from Switzerland (CH), 20 from Bulgaria (BG) and 19 from Norway (NO).
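The binary allele scoring and the pairwise Euclidean distance that underlie the AMOVA can be illustrated with a minimal Python sketch. The toy genotypes, the individual names and the helper functions below are hypothetical and only serve to show the principle; the study itself used the R packages cited above.

```python
from math import sqrt

# Hypothetical toy data (not from the study): for each individual, the set of
# observed SSR alleles, written as (locus name, fragment size in bp).
genotypes = {
    "ind1": {("locA", 101), ("locA", 105), ("locB", 200)},
    "ind2": {("locA", 101), ("locB", 200), ("locB", 204)},
    "ind3": {("locA", 105), ("locA", 109), ("locB", 204)},
}


def binary_matrix(genotypes):
    """Score every allele seen in the data set as present (1) or absent (0)."""
    alleles = sorted(set().union(*genotypes.values()))
    rows = {
        ind: [1 if allele in observed else 0 for allele in alleles]
        for ind, observed in genotypes.items()
    }
    return alleles, rows


def euclidean(u, v):
    """Euclidean distance between two presence/absence vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


alleles, rows = binary_matrix(genotypes)
d_12 = euclidean(rows["ind1"], rows["ind2"])
d_13 = euclidean(rows["ind1"], rows["ind3"])
print(rows["ind1"], round(d_12, 3), round(d_13, 3))
```

Because each allele is scored only as present or absent, allele dosage in tetraploids does not have to be estimated, which is why this scoring keeps all polymorphic loci and rare alleles in the analysis.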

Ploidy level and genetic diversity within populations
The maximum and mean number of alleles per locus across all loci was ≥ 3 for each sample. Consequently, at least one out of 29 loci per individual revealed 3 to 4 alleles per locus, which indicated tetraploidy of the corresponding individual (data not shown). Among 1861 D. glomerata plants, the 29 SSR primers detected 257 polymorphic alleles, varying in size from 73 to 260 bp (Table 2). The polymorphic information content (PIC) varied considerably, ranging from 0.12 to 0.88 (mean: 0.62 ± 0.18) (Table 2). The average expected heterozygosity (HE,C) across all loci was high in all regions, ranging from 0.44 to 0.59 (Table 1). The greatest variation in HE,C was detected in Bulgaria (Figure 1a). The mean HE,C was significantly higher in the Norwegian region (HE,C = 0.54) than in the Bulgarian region (HE,C = 0.52, P < 0.05). There was no significant difference between Switzerland (HE,C = 0.53) and the two other regions (Figure 1a). The total number of rare alleles (frequency < 5%) across all loci was 154, accounting for 59.9% of all amplified allelic bands across the three regions. The remaining 103 amplified alleles were classified as common, with an occurrence larger than 5% and in more than two locations. Not all alleles were detectable in all of the regions. Two to seven unique alleles within one region were detected (Table 3). The Bulgarian region had the greatest total number of alleles (241) (Table 3). The mean number of rare alleles per population within the Swiss region was significantly higher than within the Norwegian region (Tukey HSD, P < 0.05) (Figure 1b). Within the Swiss region, the highest mean number of alleles and rare alleles per locus per population (5.30, 1.85) was detected, which was significantly different from Norway (Tukey HSD, P < 0.05), but not Bulgaria (Table 3). Hierarchical analysis of molecular variance (AMOVA) across individuals from all three regions revealed most of the genetic diversity to be due to variation within populations (86.43%), while the variation among regions (6.46%), among farms within regions (3.88%) and among populations within farms (3.22%) was small but significant (Table 4). This pattern of within and among population partitioning of genetic variation was representative for D. glomerata populations in all three regions (Table 4).
Table 3. Correspondences to categories were based on the percentage of occurrence of alleles above (common allele) and below (rare allele) the five per cent mark, as well as the number of populations (locations) in which the allele was detected within the Swiss (CH), Norwegian (NO) and Bulgarian (BG) region.
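The allele classification reported above can be sketched as follows. This is a minimal illustration, assuming that an allele counts as rare below a frequency of 5% and as widespread when it is detected in more than two populations; the function name, the handling of alleles at exactly 5% and the toy alleles are hypothetical, and the exact category definitions follow Brown [24] and Table 3.

```python
RARE_THRESHOLD = 0.05  # from the text: rare alleles have a frequency < 5%
MIN_LOCATIONS = 2      # assumption: "widespread" = found in more than two populations


def classify(frequency, n_populations):
    """Return (abundance class, distribution class) for one allele.

    Alleles at exactly 5% are treated as common here; this is an assumption,
    since the text only names the classes above and below the 5% mark.
    """
    abundance = "rare" if frequency < RARE_THRESHOLD else "common"
    spread = "widespread" if n_populations > MIN_LOCATIONS else "localized"
    return abundance, spread


# Hypothetical alleles: name -> (overall frequency, populations where detected)
toy_alleles = {
    "a1": (0.30, 18),
    "a2": (0.02, 1),
    "a3": (0.04, 7),
    "a4": (0.08, 2),
}
classes = {name: classify(f, n) for name, (f, n) in toy_alleles.items()}

# Share of rare alleles reported in the text: 154 of 257 amplified alleles.
rare_share = 100 * 154 / 257
print(classes, round(rare_share, 1))
```

The last line reproduces the reported proportion of rare alleles from the two counts given in the text.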

Genetic distances among populations
Pairwise genetic distances were low to moderate for all pairs of regions, ranging from D = 0.03 (CH-NO) through D = 0.06 (NO-BG) to D = 0.09 (CH-BG). Genetic distances among populations within regions ranged from D = 0.01 to 0.02 (CH), D = 0.009 to 0.05 (NO) and D = 0.01 to 0.21 (BG). The greatest genetic distance between populations from different regions was D = 0.29 for the Swiss population CH04 and the Bulgarian population BG04. Testing for isolation by distance identified significant correlations between pairwise genetic distances (D) and the corresponding geographical distances between populations within the Norwegian region (rM = 0.37, P = 0.01) and among the three regions (rM = 0.39, P < 0.001) (Figure 2).
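The logic of the Mantel test used for the isolation-by-distance analysis can be sketched in a few lines of Python. This is a generic one-sided permutation version written for illustration, not the IBDWS implementation; the function names, the seed and the toy transect are assumptions.

```python
import random
from math import sqrt


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(genetic, geographic, permutations=999, seed=1):
    """Return (r_M, one-sided P) by permuting the populations of one matrix."""
    rng = random.Random(seed)
    n = len(genetic)
    geo_vector = upper_triangle(geographic)
    r_observed = pearson(upper_triangle(genetic), geo_vector)
    hits = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        # Rows and columns are permuted together so the matrix stays symmetric.
        permuted = [[genetic[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(upper_triangle(permuted), geo_vector) >= r_observed:
            hits += 1
    return r_observed, (hits + 1) / (permutations + 1)


# Hypothetical toy data: five populations along a transect (positions in km),
# with genetic distance growing linearly with geographic distance.
positions = [0.0, 1.0, 3.0, 6.0, 10.0]
geographic = [[abs(a - b) for b in positions] for a in positions]
genetic = [[0.01 * d for d in row] for row in geographic]

r_m, p_value = mantel(genetic, geographic)
print(round(r_m, 3), p_value)
```

Whole populations are permuted rather than individual matrix cells because the pairwise distances are not independent of each other; shuffling single cells would overstate the significance.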

Population structure
A moderate but clear separation of genotypes among regions was revealed by principal component analysis (PCA) based on 1861 individuals and 257 SSR alleles (Figure 3). The first two principal components (PCs) explained 10.97% of the total molecular variation among samples, while the third PC explained less than 2%. For D. glomerata from different regions, the number of populations K = 3 revealed greater variability of maximum likelihood (Ln P(D)) among different tested K values than among repeated runs and was considered as the optimal number of populations (Figure 4a). In total, 1504 of 1861 individuals were assigned to one of the three populations based on a membership probability ≥ 0.8. The proportion of membership in each pre-defined cluster (Bulgaria, Switzerland and Norway) was greatest in Switzerland (91.2%), followed by Norway (84.5%). Only 62.2% of the individuals from Bulgaria were assigned exclusively to the corresponding cluster (Figure 5). For population structures within regions, no definite number of populations could be defined based on selected numbers of K (Figure 4b-d).
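The assignment rule applied to the STRUCTURE output can be illustrated with a short Python sketch. The membership probabilities below are hypothetical; only the 0.8 cut-off and the counts 1504 and 1861 are taken from the text.

```python
THRESHOLD = 0.8  # membership probability cut-off used in the text


def assign(membership):
    """Assign an individual to its most probable cluster, or call it admixed.

    membership: dict mapping cluster name -> membership probability.
    """
    cluster, probability = max(membership.items(), key=lambda item: item[1])
    return cluster if probability >= THRESHOLD else "admixed"


# Hypothetical membership probabilities for three individuals at K = 3.
toy_memberships = {
    "ind1": {"CH": 0.93, "BG": 0.04, "NO": 0.03},
    "ind2": {"CH": 0.55, "BG": 0.40, "NO": 0.05},
    "ind3": {"CH": 0.10, "BG": 0.10, "NO": 0.80},
}
calls = {ind: assign(q) for ind, q in toy_memberships.items()}

# Share of individuals assigned in the study: 1504 of 1861.
assigned_share = 100 * 1504 / 1861
print(calls, round(assigned_share, 2))
```

The computed share corresponds to the 80.82% of individuals reported as assignable to a geographic region.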

Ploidy level of D. glomerata populations
This study revealed exclusively tetraploid individuals of D. glomerata sampled from 59 natural and semi-natural populations in three distinct regions of Europe. Although tetraploid and diploid populations can occur in sympatry [6,7], autotetraploid individuals of D. glomerata have been reported to be most abundant in cultivars and natural populations [36], which was clearly supported by this study. This provides initial information about the value of selected populations, considering polyploidy as an important factor for forage crop breeding. These tetraploid populations could be potential sources for increasing forage quality and yield, which are often related to tetraploidy in forage grasses. Furthermore, these exclusively tetraploid populations usually contain higher genetic diversity compared to diploid populations, as has been shown for Rorippa amphibia or Bromus species in European populations [37,38]. Both tetraploidy and the corresponding high degree of genetic diversity identified within selected populations of D. glomerata indicated them to be valuable sources for germplasm collections.
Table 4. Analysis was performed using a binary data set based on 29 SSR markers which generated 257 SSR bands. All variance components were significant (P < 0.001) based on 999 permutations.
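The reasoning behind the ploidy call (an individual showing three or four alleles at any locus cannot be diploid) can be sketched as follows. This is a simplified illustration with hypothetical genotypes and function names; the study used the R package "polysat" for this step.

```python
def max_alleles_per_locus(genotype):
    """Largest number of distinct alleles observed at any single locus."""
    return max(len(set(alleles)) for alleles in genotype.values())


def ploidy_call(genotype):
    """Interpret the maximum allele count under the assumptions named above."""
    m = max_alleles_per_locus(genotype)
    if m > 4:
        return "more than tetraploid"
    if m >= 3:
        return "tetraploid"
    # One or two alleles per locus are compatible with diploids and with
    # tetraploids carrying repeated alleles, so no call is made.
    return "undetermined"


# Hypothetical individuals: locus name -> observed fragment sizes (bp).
toy_individuals = {
    "ind1": {"locA": [101, 105, 109], "locB": [200, 204]},
    "ind2": {"locA": [101, 105], "locB": [200, 204, 208, 212]},
    "ind3": {"locA": [101, 101], "locB": [200, 204]},
}
ploidy = {ind: ploidy_call(g) for ind, g in toy_individuals.items()}
print(ploidy)
```

Using many loci matters here: a single locus with only one or two alleles is uninformative, whereas across 29 loci a tetraploid individual is very likely to show at least one locus with three or four alleles.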

Genetic and allelic diversity within populations
The genetic diversity in terms of average expected heterozygosity HE,C of D. glomerata within populations from different grassland regions in Europe was comparable with that of D. glomerata cultivars [39]. The genetic diversity of populations underpins their capacity to adapt to various and changing environmental conditions [40]. However, this study showed that most of the genetic variation was detected within, rather than between, populations, as has been demonstrated for other agriculturally important grass species such as Poa alpina, Festuca pratensis or Lolium multiflorum [32,41]. Similarly, the variation detected within geographic regions was larger than the variation between them, which is consistent with studies on the germplasm of Lolium perenne from different geographical regions worldwide [12]. High genetic diversity within D. glomerata populations strongly depends on various life history traits, such as the outbreeding mating system and efficient pollen dispersal by wind [42]. On average, the highest HE,C and the lowest mean number of alleles per locus per population within selected populations were detected in the Norwegian region. Since the semi-natural grassland populations did not receive any seed mixtures that included D. glomerata varieties, genotypes from natural and commercial gene pools must have immigrated from outside populations [43]. In the Norwegian region, HE,C was high for all populations. As revealed by previous studies on grassland genetic diversity in space and time, habitat age, connectivity and past use in a landscape and historical context have a major impact on current genetic diversity patterns [44]. In the 1950s, the area in Nord-Østerdal was much more open, resulting in high gene flow among connected grasslands, hence the narrow range of HE,C [45]. When farming declined in the area, establishing forests disconnected populations and interrupted gene flow among populations [46]. Relatively recent mutations within D. glomerata populations in these fragmented grassland patches could then explain the relatively high number of rare and localized alleles. In the Bulgarian region, by contrast, some populations of D. glomerata revealed high HE,C and others low HE,C values, lowering the average HE,C across all populations but widening the range of HE,C within the region. There, the landscape provided some large grassland patches with high levels of gene flow and high HE,C as well as single, remote grassland patches with low gene flow due to low connectivity. A lower gene flow into more isolated populations has been revealed for grassland species such as Globularia bisnagarica [47]. Therefore, in the Norwegian region, collections could quickly capture the genetic variation (except for the rare, localized alleles). In Bulgaria more individuals would have to be sampled, but they would eventually provide a higher total genetic diversity. Presumably, the Bulgarian populations could be more resilient to environmental changes, because some individuals might have a favorable genotype, whilst in the Norwegian region there is little difference between individuals and, therefore, less possibility for adaptation [48]. The Swiss populations revealed a similarly small range of HE,C as the Norwegian region, but a lower average HE,C across all investigated populations. Here, permanent grassland has been established for a long time without sod disturbances, e.g. rotational forage crops, or the introduction of new genetic material, e.g. by re-sowing or extended seed recruitment. Furthermore, the selected populations were located in a small geographic range (Last et al., submitted), which increases the connectivity and gene flow among populations and leads to constant intermixing and, therefore, high HE,C in all D. glomerata populations.
Although HE,C represents a common measurement for genetic diversity based on allele frequencies, allelic diversity or allelic richness plays a more relevant role for genetic conservation [49]. The presence of many rare alleles and, especially, of alleles that were detected in only single populations or regions indicated the potential value of every single population as a genetic resource. A comparable number of rare alleles within grassland species has been detected for F. pratensis in Swiss ecotype populations by using SSR markers [32]. In the Norwegian region, grassland sampling sites were fragmented by forests, which represents a common landscape structure within this area. Within these fragments, the high selection pressure of fragmentation results in increasing genetic differentiation and the loss of rare alleles in the long run [50,51]. Considering those natural populations for in situ conservation and germplasm collections might have the potential to increase the quality of grassland cultivars in terms of resilience and persistence in currently unfavorable areas [52,53].

Genetic diversity among populations
Although the genetic diversity was high in selected populations and regions, the genetic distance of individuals among populations was low and did not indicate a clear distinction of selected populations within regions. These results support previous studies on D. glomerata and L. multiflorum populations, which investigated populations less than 100 km apart [32,36] (Last et al., submitted). A high degree of gene flow is very common in self-incompatible and wind-pollinated grass species, leading to low genetic distance among individuals and populations [24]. The high abundance of individuals per species may increase gene flow within study sites, as revealed for F. pratensis [54]. The impact of differentiated evolutionary processes affecting the genetic structure of distinct grassland populations increases with increasing genetic distance due to a lack of structural and functional connectivity among populations [46,55]. No isolation by distance was detected in the Swiss region. There, the selected populations originated from a small geographic range with small distances between populations. However, isolation by distance occurred within the Norwegian region, where distances among selected populations were large and collection sites were scattered over a large geographic area. The fragmentation of the grassland patches and their disconnection by landscape change may have restricted gene flow, as has been revealed for D. glomerata populations in Turkey [36,44]. In contrast to the Norwegian region, no isolation by distance was detected among the Bulgarian populations. Although this region covered a large geographic range, grassland patches (farmland patches) belonging to a single farm were less scattered.
The genetic distance that we found among populations from distinct and distant regions in Europe was slightly higher than the genetic distance between populations within regions, and reflects results from population genetic studies among natural populations and cultivars of D. glomerata worldwide [56]. The increase of genetic distance among populations of D. glomerata between distant regions reflects what has been found in previous studies on isolation-by-distance patterns of Festuca arundinacea investigated within large geographic ranges [57]. Although self-incompatible and wind-pollinated species are expected to reach the highest rate of gene flow among individuals and populations, the distance of pollen distribution is restricted and very rarely extends to long-distance transport [54,58]. Although there was a clear separation of populations from distinct geographical regions, the probability of an individual belonging to one of the regional subgroups was less than 80% for some genotypes. This indicated an admixture of genetic information among regions regardless of the large geographic distances among regions. These admixtures could be explained by the assumption of a common ancestor within the Poaceae family and differentiated selective forces resulting from different environmental and ecological conditions [52]. According to this theory, most of the individuals contain the same genotypic constitution adapted to local or geographical conditions, while only single individuals remain admixed as the common ancestors were. Alternatively, admixture or admixed genotypes might result from human-mediated transfer of grasses and their seed material among populations that are geographically apart and genetically distinct. A high agricultural importance, the widespread use of common seed material in the past and the constant outcrossing of natural populations and introduced germplasm can additionally affect genetic diversity patterns today [59].

Conclusions
The investigation of 59 natural and semi-natural populations of D. glomerata revealed not only exclusively tetraploid individuals, but also high genetic diversity and a high number of rare and geographically unique alleles in geographically distinct populations. The three regions revealed genetically distinct patterns and were differentiated from each other. These populations of D. glomerata might contain valuable sources for plants adapted to specific, but differentiated environmental conditions. In particular, the high number of rare, localized alleles in Norway or the high number of unique alleles located in Bulgaria may indicate valuable sources for breeding material adapted to climatic and environmental changes in certain regions. To conserve a high amount of genetic diversity, large, permanent grassland patches with natural populations of D. glomerata, as represented in Bulgaria, should be considered. Fragmented and smaller grassland patches, as represented in the Norwegian region, can on the other hand provide a high number of rare, localized alleles of D. glomerata. In general, genetic material from distinct geographical regions and multiple populations should be considered for ex situ and in situ conservation.