Genetic diversity and population structure analyses of Plectranthus edulis (Vatke) Agnew collections from diverse agro-ecologies in Ethiopia using newly developed EST-SSRs marker system

Background Plectranthus edulis (Vatke) Agnew (locally known as Ethiopian dinich or Ethiopian potato) is one of the most economically important edible tuber crops indigenous to Ethiopia. Evaluating the extent of genetic diversity within and among populations is one of the first and most important steps in breeding and conservation measures. Hence, this study was aimed at evaluating the genetic diversity and population structure of this crop using collections from diverse agro-ecologies in Ethiopia. Results Twenty polymorphic expressed sequence tag based simple sequence repeat (EST-SSRs) markers were developed for P. edulis based on EST sequences of P. barbatus deposited in the GenBank. These markers were used for genetic diversity analyses of 287 individual plants representing 12 populations, and a total of 128 alleles were identified across the entire loci and populations. Different parameters were used to estimate the genetic diversity within populations; and gene diversity index (GD) ranged from 0.31 to 0.39 with overall mean of 0.35. Hierarchical analysis of molecular variance (AMOVA) showed significant but low population differentiation with only 3% of the total variation accounted for variation among populations. Likewise, cluster and STRUCTURE analyses did not group the populations into sharply distinct clusters, which could be attributed to historical and contemporary gene flow and the reproductive biology of the crop. Conclusions These newly developed EST-SSR markers are highly polymorphic within P. edulis and hence are valuable genetic tools that can be used to evaluate the extent of genetic diversity and population structure of not only P. edulis but also various other species within the Lamiaceae family. Among the 12 populations studied, populations collected from Wenbera, Awi and Wolaita showed a higher genetic diversity as compared to other populations, and hence these areas can be considered as hot spots for in-situ conservation as well as for identification of genotypes that can be used in breeding programs. Electronic supplementary material The online version of this article (10.1186/s12863-018-0682-z) contains supplementary material, which is available to authorized users.


Background
Ethiopia is one of the countries in the world that has the greatest crop genetic diversity and considered as a primary gene center for several crops [1][2][3] including edible roots and tubers [4]. However, the food potential of horticultural crops, particularly those of indigenous edible roots and tubers have not been fully exploited despite their significant contributions to the livelihood of subsistence farmers. Such crops are overlooked in terms of research and breeding and their production and management systems are restricted to local farmers' varieties maintained by farmers using local knowledge [5][6][7].
Plectranthus edulis (Vatke) Agnew (Lamiaceae), also locally known as Ethiopian potato syno. Ethiopian dinich, is one of the most economically important indigenous tuber crops [2,8] with wide distribution in parts of Africa, largely in wild form [9]. It produces edible stem tubers on stolon below ground. The crop can thrive and set tubers under significant environmental constraints including degraded and poor soil and can produce reasonable yield without a need for intensive management practices [10].
In Ethiopia, this edible tuber crop is commonly cultivated by smallholder farmers around homesteads, alone or mixed with cereals, fruits and pulses, largely for household consumption and rarely for marketing of its tubers. It was one of the most common Ethiopian staple crops [11] and usually called 'hunger crop' as it fills the food supply gap that occurs from August to November, the period before the harvest of cereal crops. It has also been widely used as a folk medicine in several parts of Africa including Ethiopia [12] and is commonly visited by honeybees for its nectar [13].
Nowadays, local gene pool of the crop is under the threat of severe genetic erosion as it is disappearing from several areas where it used to be widely cultivated and restricted to highly marginalized land and limited to few elderly farmers in some other areas. The lack of improved planting material and limited awareness about the significance of this crop among younger farmers as well as the introduction of other high yielding tuber crops like Irish potato to the area, recurrent drought and environmental degradation have contributed to the genetic erosion of this crop [5,14,15].
Assessing the extent of genetic diversity of crop species is a vitally important step for their effective conservation and improvement. Molecular markers, such as simple sequence repeats (SSRs) are among the most important genetic tools for such purpose [16,17]. SSRs derived from Expressed sequence tags (ESTs) (EST-SSRs) have been widely used in many plant species including root and tuber crops, as they are relatively fast and cost-effective to develop [18], have a well-conserved flanking sequences among phylogenetically closely related species and hence are highly transferable among related taxa, and are less susceptible to null alleles [19]. Such cross-genera and cross-species transferable molecular markers are highly important genomic resources to study plant species with little or no DNA sequence information, such as P. edulis. Until recently, very few transferable EST-SSR markers have been developed within the family Lamiaceae [20,21] and, so far, there is no report of such markers for the genus Plectranthus. Moreover, there is no publication on molecular marker based genetic diversity of P. edulis, which shows that researchers and plant breeders have not given enough attention for conservation and improvement of this crop. Consequently, its production, utilization, and improvement are highly restricted. Hence, this study was initiated with the aim of developing and validating EST-SSR markers for use in population structure and genetic diversity analyses of the crop, which eventually serve as a basis for its improvement and sustainable conservation.

Plant material
A total of 174 tuber samples of cultivated P. edulis, representing 12 populations, were randomly collected from farmers' fields with permission from individual farmers ( Fig. 1, Table 1). The identity of the samples was confirmed based on the species description provided in Flora of Ethiopia and Eritrea [22]. The collected samples were planted during the regular crop growing season (end of April) in 2016 at Holeta Agricultural Research Centre (which is a part of the Ethiopian Institute of Agricultural Research) field site located 40 Km west of Addis Ababa. This field site is located at a geographic position of 09 o 04'N, 38 o 29′E and altitude of 2400 masl.
Leaf sample collection and DNA extraction oung leaf tissue from 287 individual plants (one to three individual plants per tuber sample) ( Table 1, Additional file 1) were separately collected in a ziplock bag, silica gel dried and transported to the Swedish University of Agricultural Sciences (SLU), Alnarp, Sweden. Genomic DNA was extracted from these samples using a modified Cetyl Trimethyl Ammonium Bromide (CTAB) protocol as described in Geleta et al. [23]. The DNA quality was assessed using 1% agarose gel electrophoresis whereas NanoDrop® ND -1000 Spectrophotometer (Saveen Warner, Sweden) was used to determine the quantity of extracted DNA.

Data mining, designing and screening EST-SSR primers
Initially, 3263 P. barbatus EST sequences were retrieved from the National Centre for Biotechnology Information (NCBI public DNA sequence database (https://www.ncbi.nlm.nih.gov/nucest/?term=Plectran thus+barbatus), and sequences containing SSRs were screened using WebSat, a web-based software for  Table 1 for full description of the populations). The map was original and constructed using geographic coordinates and elevation data gathered from each collection sites using global positioning system (GPS) microsatellite marker development [24] (http://purl.oclc.org/NET/websat/). After excluding redundant, overlapping and short sequences, about 300 sequences containing two to six SSR motifs were identified. Of these, 40 sequences were selected for designing primers using Primer3 primer designing program [25,26] (http://bioinfo.ut.ee/primer3-0.4.0/). The 40 newly designed primer-pairs were tested for amplification of their target genomic regions using 36 DNA samples representing the 12 populations of P. edulis. The polymerase chain reaction (PCR) products were electrophoresed on 1.5% agarose gel containing gel-red and visualized using Saveen Werner AB UV camera equipped with SSM930CE Sony Black and white Monitor. The size of amplified products was estimated by loading GeneRuler 50 bp DNA ladder together with the samples on separate lanes. Under optimized PCR conditions, 20 of the 40 primer-pairs consistently amplified their polymorphic loci, and hence were selected for use on all samples included in the present study (Table 2).

Pre-amplification and capillary electrophoresis
For cost effectiveness and improved quality of amplified products, the sequences of the 20 primer-pairs were modified as follows: (1) A 18-bp universal M13 primer sequence (5′-TGTAAAACGACGGCCAGT-3′) was added as a common tail to the 5′-end of all forward primers following Oetting et al. [27]; and (2) a PIG-tail sequence of 5′-GCTTCT-3′ was added to the 5′-end of the reverse primers to prevent non-templated addition of nucleotides to amplified products as described in Brownstein et al. [28]. M13 primer 5′-end labeled with HEX™ or 6-FAM™ fluorophores was used as a third primer in each PCR amplification.
PCR was carried out using 96-well plates with 25 μl reaction volume [1× reaction buffer, 1.5 mM MgCl 2 , 0.3 mM dNTPs, 0.08 μM M13-tailed forward primer, 0.3 μM pig-tailed reverse primer, 0.3 μM 6-FAM or HEX labeled M13 primer, 1 U Dream Taq DNA Polymerase, and 25 ng template DNA]. A mixture of all the components except genomic DNA was included as a negative control. Amplification was performed using GeneAMP PCR 9700 thermocycler (Applied Biosystems Inc. USA) according to the following five-stage PCR protocol: (1) initial 15 min denaturation at 95°C, (2) 35 cycles of 30 s denaturation at 94°C, 30 s annealing at optimized annealing temperature for each primer-pair (see Table 1), and 30 s primer extension at 72°C, (3) eight cycles of 30 s denaturation at 94°C , 45 s annealing at 53°C and 45 s primer extension at 72°C, (4) additional primer extension at 72°C for 10 min, and (5) a final 30 min primer extension at 60°C. The PCR products were stored at 4°C until they were electrophoresed. The PCR products were multiplexed into panels based on fragment sizes of the SSRs and the fluorescence label of the M13 primer and diluted 25× using Millipore water. Finally, 0.7 μl of multiplexed and diluted PCR products, 1.9 μl Hi-Di formamide and 0.3 μl size standard (GenScanTm600 LIZ® size standard) were mixed and ccapillary electrophoresis was conducted using Genetic Analyzer 3500 (Applied Biosystems) at SLU, Department of Plant Breeding, Alnarp, Sweden.

Allele scoring and statistical analysis
Peak identification and fragment sizing were done using GeneMarker V2.6.0 (SoftGenetics®) with default settings and 200 threshold intensities. Then, the allele size (bp) data at each locus were exported to excel for statistical analyses. Locus based diversity indices: major allele frequency (MAF), the number of alleles (NA), gene diversity (GD), and polymorphic information content (PIC) were determined using PowerMarker ver. 3.25 [29] ( Table 3). The number of effective alleles (Ne), observed heterozygosity (Ho), expected heterozygosity (He) [30], Shannon's Information Index (I), and estimate of the deviation from Hardy-Weinberg equilibrium (HWE) over the entire populations and population genetic diversity indices: Ne, percentage of polymorphic loci (PPL), Ho, He, I, Nei's gene diversity over the entire loci were computed using GenAlEx ver. 6.501 [31] (Table 4). To determine the correlation between observed allelic diversity and sample size of populations, rarified allelic richness (Ar) and private rarified allelic richness (Ap) were estimated using HP-Rare 1.1 software [32]. To identify genetic groups, G, the number of distinct multi-locus genotypes (MLGs) present in each sample were evaluated using GenClone 2.0 [33]. To evaluate the global clonality rate of the sample, the index of clonal diversity was computed as G/N, where G is the number of MLGs and N is the total number of genotyped individuals. To analyze the distribution of genetic variation and to estimate the variance components of the populations, analysis of molecular variance (AMOVA) was computed using Arlequin ver. 3.5.2.2 [34]. Population differentiation tests: Wrights fixation index (F ST ) and pairwise F ST were computed using GenAlEx ver. 6.501 [31], and significance was tested based on 1000 bootstraps. Simple matching dissimilarity coefficient-based Neighbor-Joining tree and Nei's standard genetic distance (D ST, corrected) [35] based Unweighted Pair Group Method with Arithmetic Mean (UPGMA) [36] tree was constructed using DARwin var. 6.0.14 [37] and POPTREE2 [38], respectively, and significance was tested based on 1000 bootstraps [39]. The resulting trees were displayed using TreeView (Win 32) 1.6.6 program [40] and FigTree var. 1.4.3 [41]. Gene flow (Nm) among populations was estimated using the formula, Nm = 0.25(1 − Fst)/Fst [42].
A Bayesian model-based clustering algorithm in STRUCTURE ver. 2.3.4 [43,44] was applied to infer the pattern of population structure and detection of admixture. To determine the most likely number of populations (K), a burn-in period of 50,000 was used in each run, and data were collected over 500,000 Markov Chain Monte Carlo (MCMC) replications for K = 1 to K = 12 using 20 iterations for each K. The optimum K value was predicted following the simulation method of Evanno et al. [45] using the web-based STRUCTURE HARVESTER ver. 0.6.92 [46]. Bar plot for the optimum K was determined using Clumpak beta version [47].

Validation of the EST-SSRs marker and evaluation of their levels of polymorphism
The 20 newly developed EST-SSRs markers ( Table 2) are predominantly trinucleotides and dinucleotides. Trinucleotide SSRs accounted for 55% of the loci with the number of repeats ranging from four to ten whereas dinucleotide SSRs accounted for 35% of the loci with the number of repeats ranging from six to ten. The remaining two loci (10%) were pentanucleotide and hexanucleotide repeats with four and five number of repeats, in that order. All the 20 loci were polymorphic and produced a total of 128 alleles (an average of 6.4 alleles per locus) (  and PIC (0.05) were recorded for PE_07. In terms of the overall PIC, one SSR locus (PE_06) was found to be highly informative (PIC ≥0.5), 12 loci (60%) were moderately informative (0.5 < PIC ≥0.25), and the remaining seven were less informative (PIC < 0.25) ( Table 3). The highest observed heterozygosity (Ho) (0.97), the lowest fixation index (− 0.75) and the highest value for gene flow (Nm) (35.1) were recorded for PE_02 ( Table 3). Ten of the twenty loci showed a highly significant deviation from HW-equilibrium over the entire populations (Table 3).

Genetic variation within and among populations
Among the 12 populations studied, not much differences were observed in terms of a number of genetic diversity paramters, such as effective number of alleles (Ne), observed heterozygosity (Ho) and expected heterozygosity (He), gene diversity (GD) and Shannon diversity index (I). However, Wen, WS and Aw populations scored higher values in Ne, GD and I while SwSh and HKT populations scored a higher Ho as compared to the other populations. Five populations: Aw, Gur, EW, WSh, and YeL, in the order of magnitude, scored slightly less than the mean Ho whereas IAB, Jim, and GG populations had a mean Ho value. Although Ho value is the lowest for Gur population, it comes first in terms of allelic richness (Ar) including richness in private alleles (Arp) with Ar of 3.33 and Arp of 0.28 (Table 4). Jim and WSh populations ranked second and third in terms of overall Arp. The mean number of distinct multi-locus genotypes (MLGs) and index of clonality (G/N) over the entire samples (populations) was 13.08 and 0.55, respectively ( Table 4). The analysis of percentage of polymorphic loci (PPL) showed that at least 90% of the loci were polymorphic in each population studied, with a mean PPL of 94.2% (Table 4).

Population genetic differentiation and gene flow
The hierarchical AMOVA was conducted without grouping the populations as well as by grouping the populations according to administrative regions (AR) and geographical regions (GR). In all cases, variation within individuals accounted for at least 97% of the total variation. The variation among populations, AR and GR accounted for only 3, 2 and 2% of the total variation, respectively. Hence, most of the within population variation is due to the heterozygosity of the individuals within each population. The overall F ST value was very small (0.03) ( Table 5, Additional file 3). The overall gene flow among the populations was estimated to be high (Nm = 5.84) ( Table 3) whereas pairwise population differentiation test computed according to Weir [30], with exclusion of null alleles, ranged from 0.01 to 0.07.

Genetic distance between the populations
Nei's standard genetic distance between populations ranged from 0.02 to 0.05. The highest pairwise genetic distance (0.05) was observed between four pairs of populations (Aw vs WS, Wen vs HKT, WS and YeL). The mean genetic distance of each population from the other populations ranged from 0.02 to 0.04 and, in terms of this parameter, Wen is the most distantly related population with a mean genetic distance of 0.04 (Table 6). Similarly, Weir's estimation of population differentiation (F ST ) [30,48] ranged from 0.01 to 0.07 with the highest value recorded between Wen and YeL populations.

Cluster analysis, PCoA, and population genetic structure
The neighbor-joining based cluster analysis of 60 individual samples randomly selected across the 12 populations (five samples per population) resulted in four major clusters (C1, C2, C3, and C4) with the first three clusters further divided into two sub-clusters. Each of the four clusters comprised individual plants from different collection zones (geographic regions). However, samples are more or less grouped according to their geographic region of origin at sub cluster levels (designated as i and ii on each major cluster) although there is considerable intermixes (Fig. 2). UPGMA [36] cluster analysis with bootstrap tests [39] has been conducted using Nei's standard genetic distance at population level.
The 12 populations formed four major clusters (I, II, III and IV) with cluster IV comprising eight populations that formed two sub-clusters (i and ii) (Fig. 3).
A PCoA analysis revealed that the majority of samples were placed at the center of a two-dimensional coordinate plane (Fig. 4) forming roughly three groups (C1, C2 and C3), showing poor population structure. The first three axes explained 32% of the total variation.
The Bayesian approach-based assignment of the 287 individual plants to different populations and determination of their population structure [45], using STRUC-TURE outputs, predicted K = 3 to be the most likely number of clusters (Fig. 5a). Based on this value, Clumpak result (bar plot) showed wide admixtures and hence there was no clear geographic origin-based structuring of populations (Fig. 5b).

EST-SSR markers validation and polymorphism evaluation
P. edulis is an indigenous tuber crop of Ethiopia that plays an indispensable role in food security of subsistence farmers in areas where it is cultivated. As the first step in the development of genomic tools and resources that can promote the conservation and breeding of this crop, we developed and validated 20 polymorphic EST-SSR markers. This work reports the transferability of SSR markers from P. barbatus to P. edulis and hence it enriches the limited reports on cross-genera [49] and   cross-species [50] transferability of molecular markers, and molecular marker based genetic diversity analyses of Lamiaceae species. In the present study, the screening of 3263 P. barbatus ESTs resulted in 301 sequences (9.2%) containing SSRs. This proportion suggest that EST-SSRs are less abundant in P. barbatus than in Salvia miltiorrhiza (14.7%) [51] and Lavandula species (18.8%) [49], and slightly more abundant in P. barbatus than in Mentha piperita (8.4%) [52], which are all Lamiaceae species. The results also suggest that the abundance of the EST-SSRs in P. barbatus is higher than their abundance in cereal crops such as rice (4.7%), sorghum (3.6%), barley (3.4%) and maize (1.4%) [53]. However, other factors such as the SSR search criteria and size of the dataset might have partly contributed to the differences [54]. A maximum of two alleles are expected per individual plant at single copy microsatellite loci in diploid species. In the present study, a maximum of two alleles per plant were detected at each of the 20 loci, indicating that they are all single copy and have disomic inheritance regardless of the thus far reported basic chromosome number and ploidy level for several members of the genus Plectranthus [55]. However, the chromosome number of P. edulis has not been reported and, hence, further in-depth cytogenetic analysis is important to confirm the finding of this study.
The EST-SSR markers developed in the present study contained di-, tri-penta-and hexanucleotide repeats. Studies have shown that di, tri and tetra-nucleotide repeat SSRs are the most commonly used motifs in molecular genetic studies [56]. Tri-nucleotide repeat motifs were relatively abundant. This is attributed to the fact that they are more frequent in the EST's coding regions, unlike in non-coding regions, in almost all taxa studied [54,[57][58][59] because of the positive selection for specific amino-acid stretches [60] or the prevalent selection against frameshift mutations in these regions for dinucleotides and other non-triplet repeat motifs [61]. As length and total size of perfect array of microsatellites increases, the frequency of repeats decreases and hence the informativeness increases [59,62] owing to the higher mutation rates in longer microsatellites [63]. In agreement with this, the average number of repeats in dinucleotide SSRs is higher (8.7) than trinucleotides SSRs (5.7) among the SSRs developed in the present study. Similarly, the informativeness in terms of a number of alleles, GD and PIC was higher for di-nucleotide repeats than trinucleotides suggesting higher rate of evolution for SSRs with shorter repeat motifs than SSRs with longer repeat motifs.
The use of molecular markers for efficient selection of genotypes with desirable traits and enhancing the efficiency of breeding by allowing effective simultaneous selection of various desirable traits is a well-established approach [64,65]. Hence, the large number of alleles detected in the present study suggests the suitability of microsatellites in general and those developed in this study in particular for genetic linkage and QTL mapping of desirable traits followed by marker assisted selection (MAS) in breeding programmes. However, most of the alleles were rare and scarce suggesting minimum selection pressure against the alleles. Otherwise, clonally propagating crops are expected to bear less proportion of such alleles as compared to seed-propagating crops. Moreover, the higher number of private alleles observed at several loci (Example: PE_07, PE_16, PE_18, PE_08) could offer a good opportunity to evaluate P. edulis genetic materials for the association of particular alleles with traits of interest and for conservation. Such alleles are useful in comparing diversity between species or populations [66] and also for measuring genetic distinctiveness of individuals in a population [67].
The average percent polymorphism per population revealed in the present study across the 20 loci (94%) is by far greater than the level reported by Kumar et al. [52] for 13 accessions of M. piperita (61%), which shares the same family with P. edulis. However, it was similar with that reported for 28 alfalfa accessions (97%) [68] and 37 Opium poppy accessions (96%) [69]. Such high percent polymorphism together with the PIC values obtained, which provides an estimate of the discriminatory power of a locus [70], and the allelic diversity suggest great potential of the markers for use in future genetic studies. However, the informativeness of a considerable number of loci is low and hence, there is a need to develop more highly informative EST-SSRs or other type of DNA markers that are suitable to characterize P. edulis genetic resources for efficient conservation and breeding.
The loci studied displayed differences between Ho and He in which half of them showed excess heterozygosity that led to a significant departure from HWE across populations. Such excess heterozygosity is expected in historically outcrossing species that maintain their heterozygosity through vegetative propagation, or if other factors such as natural and artificial selection pressure favor heterozygosity. Similar results have been reported in sweet potato [71,72].

Population genetic diversity
Higher genetic diversity is expected in larger and older populations when compared with small and newly established ones because of higher levels of accumulated and maintained genetic variation [73] which is important in increasing fitness and therefore reduces the likelihood of local extinction [74]. However, the mean observed heterozygosity (0.39), Shannon's information index (0.61) and Nei's gene diversity (0.35) obtained in the present study showed a medium level of genetic variation within populations. This could be mainly due to a relatively narrow genetic basis of the populations that resulted from limited germplasm resources accessible to farmers, or due to reduction in population size both due to natural as well as human factors, such as replacing cultivation of P. edulis by other crops. In addition, farmers' preferences for selected traits of economic importance such as tuber size, tuber skin color, tuber texture, maturity etc. and asexual mode of reproduction of the crop (clonal propagation), which was evidenced from the considerably higher global clonality index (G/N), could have contributed to limited genetic variation in P. edulis populations. There are similar reports on potato cultivars from Yunnan province, China [75].
Wen, WS, and Aw populations are genetically more diverse than the other populations as estimated by parameters such as gene diversity, heterozygosity and Shannon diversity index and hence the areas representing these populations could be considered as genetic diversity hot spots and a potential in-situ conservation sites for P. edulis. Among the populations, EW has the least genetic diversity, which might suggest current rapid genetic erosion from the area (population bottleneck) or intensive artificial selection pressure to maximize tuber yield. In terms of allelic richness, Gur, Jim, WS and WSh populations are the top four in that order, and hence are more interesting in terms of genetic and evolutionary studies on this crop because allelic richness is more informative in this regard as it is sensitive to the presence of rare alleles [76] (which is prominent in this study) and population bottlenecks when compared to other parameters such as expected heterozygosity. Moreover, these populations except WS bear a relatively high proportion of private alleles which may indicate certain level of independent evolution of their gene pools that allowed maintenance of private alleles at a population level [77].

Population genetic differentiation
AMOVA revealed that P. edulis has very low genetic differentiation among populations, which accounted only for 3% of the total genetic variation. The result is in line with previous reports on clonally propagating crops, such as Ensete (Ensete ventricosum) [78,79], as such species tend to be more diverse within populations (but largely lower than sexually reproducing species) than among populations. Likewise, F ST averaged across all loci (F ST = 0.03) and pairwise F ST for all pairs of populations (highest value = 0.07) was generally low to moderate on the bases of Wright [80] and Hartl and Clark [81] suggestions. Wright [80] indicated that genetic differentiation among populations can be considered high if the value of F ST is greater than 0.25. This could be partly attained if gene flow, which is a powerful force to decrease differentiation among populations, is low (Nm < 1) [82]. Hence, the present study showed that P. edulis has very little population sub-structuring. The low population differentiation is supported by high gene flow (mean Nm = 18.29) owing to step-wise pollen movement across populations, germplasm exchange in the form of tubers and seeds through sharing common markets among several of the adjacent areas where different populations were collected. This study also showed the minimal effects of regions or geographic origins of populations on genetic variation in P. edulis. This could be partly explained by the extensive exchange of tubers as planting materials among farmers (gene flow), common origin of the populations, the clonally propagating nature of the crop in which only a limited number of individuals contribute tubers to the next generation, which gradually leads to recent or old population bottlenecks and hence facilitate genetic drift.
A pair-wise population genetic differentiation analysis resulted in a seven-fold variation in F ST value among pairs of populations (ranging from 0.01 to 0.07). The highest population differentiation was observed between Wen and HKT, Wen and WS, Wen and YeL as well as between Aw and WS populations. Wen population showed the highest (0.05) pairwise Nei's standard genetic distance with WS and HKT populations and is the most genetically distinct population with a mean Nei's standard genetic distance 0.04 (Table 6). This can be partly explained by the fact that Wen and Aw populations were collected from a relatively pocket location and are separated from the other populations with a relatively longer geographic distance that probably restricted recent seed and tuber exchange. Hence, these populations may serve as potential sources of new genetic variation of important traits that can be used in breeding programs.

Population genetic relationship and structure analysis
Neighbour joining cluster analysis in which each population is represented by five individual plants revealed a weak clustering pattern confirming low genetic differentiation among the populations and suggesting that the genetic background of P. edulis populations does not always correlate with their geographical origin. Although UPGMA and PCoA analyses also showed a certain level of populations clustering according to their geographical regions, the clustering pattern is weak to support the concept of "isolation by distance" [80]. Similarly, structure analysis revealed a close relationship (weak sub-division) among the samples from the 12 collection zones, and in general, three inferred groups (K = 3) with potential admixtures and have been observed. It is interesting to indicate that all individual plants analyzed have alleles originated from the three clusters, which supports the presence of a strong gene flow that led to poor population differentiation.

Conclusions
In this study, we developed 20 EST-SSR markers for P. edulis based on EST sequences of P. barbatus deposited in the GenBank. All the markers were polymorphic in the populations studied and are valuable genetic tools to help evaluate the extent of genetic diversity and population structure of not only P. edulis but also of various species in the family Lamiaceae. These markers detected a larger number of alleles, some of which were private alleles that may be linked to important agronomic traits. The study also showed the potential of EST-SSR markers in defining how P. edulis genetic diversity is structured, and hence contribute to the development of better in-situ and ex-situ management strategies as well as selection criteria for the germplasm to be used in breeding programs for the improvement of various desirable traits in this crop. Among the 12 administrative zones and woredas considered, Wenbera, Awi and Wolaita have populations with a relatively high genetic diversity, and hence can be considered as hot spots for in-situ conservation of P. edulis as well as sources of desirable alleles for breeding values. Further studies that include germplasm from the remaining administrative zones and combine molecular characterization with agro-morphological analysis would be important to reveal additional potential sites for conservation and development of best-performing varieties. Overall, this study offered baseline information that promote further studies to exploit the high economic and endogenous values and to stop and reverse the current rapid genetic erosion of P. edulis.