Genetic diversity, multiplicity of infection and population structure of Schistosoma mansoni isolates from human hosts in Ethiopia

Background Human intestinal schistosomiasis caused by Schistosoma mansoni and urinary schistosomiasis caused by Schistosoma haematobium are endemic in Ethiopia. Although schistosomes look morphologically uniform, they vary in infectivity, egg productivity and virulence because of variation in their genetic make-up. Knowing the genetic diversity and population structure of S. mansoni isolates will make it possible to understand and take into account this possible variability in infectivity, egg productivity and virulence. Methods Between 2010 and 2011, the genetic diversity and population structure of Schistosoma mansoni isolates from four endemic areas of Ethiopia were assessed using 11 previously published polymorphic microsatellite loci. Miracidia were hatched from eggs of S. mansoni collected from stools of human subjects residing in Kemissie, Wondo Genet, Ziway and Sille-Elgo villages. DNA was extracted from single miracidia and PCR was run following a standard protocol. Allelic polymorphism and population genetic structure were analyzed using several software packages. Results At the population level (i.e. different villages), the mean number of alleles per locus, allelic richness, expected heterozygosity under Hardy–Weinberg equilibrium and pairwise FST values ranged from 8.5 to 11.5, 3.46–20.8, 0.66–0.73 and 3.57–13.63 %, respectively. All analyses of population genetic structure revealed strong genetic structuring corresponding to the four sampled villages. At the infrapopulation level (i.e. different hosts), the mean number of alleles per locus, allelic richness, expected heterozygosity under Hardy–Weinberg equilibrium and FIS values ranged from 3.09 to 7.55, 1–1.96, 0.59–0.73 and 0.1763–0.4989, respectively. The mean estimated proportion of genetically unique adult worm pairs within hosts ranged from 66 to 92 %, revealing that single hosts are infected with multiple genetically unique S. mansoni strains. The data also indicated genetic variation both between and within hosts.
Conclusion A high level of genetic diversity and significant population differentiation characterized the S. mansoni isolates of Ethiopia. These results differ markedly from those of previous studies, demonstrating that it is difficult to generalize schistosome transmission patterns because the epidemiological situation tends to vary. These are important factors to consider in relation to morbidity, drug resistance or vaccine development.


Background
Human intestinal schistosomiasis caused by Schistosoma mansoni is endemic in Ethiopia. Varying in distribution and magnitude of disease burden, S. mansoni is widely distributed throughout the country and hence is a major public health problem [1][2][3].
Schistosomes reproduce sexually in their definitive hosts (humans, other primates, rodents), and this type of reproduction allows for reassortment and the perpetuation of parasite genotypic diversity. The eggs laid by the adult female worms pass in the host's faeces and each will hatch in water to release a miracidium. These free-swimming larvae must find and penetrate an appropriate freshwater snail, in the case of S. mansoni, a snail of the genus Biomphalaria. Once it has penetrated the snail, the miracidium transforms into a mother sporocyst, which produces multiple daughter sporocysts through asexual reproduction. These in turn produce cercariae, which are released into water and are infective to human hosts, thus completing the life cycle.
Although S. mansoni is considered morphologically uniform, strains from the same or different geographical locations are known to differ in egg production, infectivity, pathogenicity and susceptibility to chemotherapy [4]. These characteristics could be due to differences in the population genetic structure of schistosomes. A previous study indicated the presence of genetic diversity in S. mansoni parasites [5]. A study of the genetic composition of natural populations of S. mansoni in northern Senegal using nine microsatellite markers revealed a random distribution (panmixia) of parasite genetic variation among villages and hosts, confirming the concept of human hosts as 'genetic mixing bowls' for schistosomes [6]. Although host sex and village of residence did not show any association with parasite genetics, host age was significantly correlated with parasite inbreeding and heterozygosity, with children being more often infected by related parasites than adults. The study suggests that host-specific factors, such as age and concomitant immunity, may shape the genetic composition of schistosome populations, revealing important insights into host-parasite interactions within a natural system [6][7][8]. Thus, elucidating the distribution of parasite genetic diversity is critical to the understanding and prediction of disease epidemiology. One of the primary reasons for studying parasite population genetics is to understand demographic parameters, such as gene flow and population size, which are not readily observable using conventional ecological methods. These insights allow inferences regarding the patterns of parasite transmission and recruitment within the environment [9].
Moreover, studying population genetic structure with molecular tools makes it possible to determine whether changes in gene frequencies provide insight into the effectiveness of treatment, to understand the impacts of treatment on the gene pool and population structure of Schistosoma parasites, and to establish whether movement of humans from refugia or non-treated areas introduces new parasites into local populations [10,11]. A meta-analysis of eight Schistosoma mansoni microsatellite datasets (two published and six unpublished) collected from individual schistosome-infected school children showed that S. mansoni populations were more diverse in East than West African schools, but heterozygosity levels did not vary significantly with geography [12]. Genetic structuring was also detected in Schistosoma mansoni and Schistosoma haematobium populations from different countries in sub-Saharan Africa, indicative of isolation by distance [12]. Other studies in African countries have reported a link between schistosome infection intensity, transmission and parasite genotype and the genetic structure of worm populations [5]. Although considerable epidemiological work has been conducted in Ethiopia, no work has so far been done on the molecular characterization of S. mansoni. Our previous epidemiological survey of three villages from northern, central and southern Ethiopia showed considerable variation in the prevalence and intensity of infection [1]. The prevalence of Schistosoma mansoni infection among the study participants in Kemissie, Wondo Genet and Sille-Elgo was 89.6, 59.9, and 31.6 %, respectively. The highest and geometric mean egg counts per gram of stool were 5208 and 346 for Kemissie, 8472 and 252 for Wondo Genet, and 3960 and 91 for Sille-Elgo [1]. Therefore, given the different epidemiological patterns and infection intensities of S. mansoni observed across these villages, the present study hypothesised that there would be substantial differences in genetic diversity between isolates of Ethiopian S. mansoni populations.

Study area
The study was conducted in four geographically distant Schistosoma mansoni endemic areas: Wondo Genet, about 261 km south of Addis Ababa, located at 07°05′35″N, 038°36′66″E at an altitude of 1755 m above sea level; Ziway, about 164 km from the capital Addis Ababa, located at 07°56′37″N, 038°43′25″E at an altitude of 1642 m above sea level; Sille-Elgo, about 525 km southwest of Addis Ababa, located at about 05°28′39″N, 037°26′02″E at an altitude of 1188 m above sea level; and Kemissie, about 305 km northeast of Addis Ababa, located at 10°43′30″N, 039°04′20″E at an altitude of 1450 m above sea level. Work permission was obtained from local administrative officers, health offices and school principals.

Study design
From 2010 to 2011, a cross-sectional parasitological study was conducted in the four study areas to determine the prevalence and intensity of Schistosoma mansoni infection. Stool specimens were collected a second time from those found to have a high intensity of Schistosoma mansoni infection in order to harvest eggs and hatch miracidia. Single miracidia were then used to determine the genetic diversity and population structure of Schistosoma mansoni isolates from the four endemic study areas.
Stool collection, examination, and miracidia hatching
Small plastic sheets were distributed to voluntary study participants, and sizable stool specimens were collected and examined using the Kato-Katz method (41.7 mg template) [13]. Infection status was determined by the presence or absence of Schistosoma mansoni eggs. Stool specimens were collected from 16 S. mansoni positive subjects in Sille-Elgo, 30 subjects in Wondo Genet, 30 subjects in Kemissie and 15 subjects in Ziway, totaling 91. The stool specimens were kept in 0.85 % saline in vials and transported in an ice box to the Medical Parasitology Laboratory of Aklilu Lemma Institute of Pathobiology, Addis Ababa University, to harvest miracidia.
To stimulate hatching of miracidia, stool samples were homogenized with saline, sieved through tiered sieves of 425, 180 and 140 μm mesh size and kept for about 20 min in the dark to allow the eggs to settle at the bottom of the flask. The supernatant was poured off and the eggs were put in a 250 ml flask filled with aged water. The flasks were covered with black carbon paper and aluminum foil and exposed to artificial light to initiate hatching. This exploits the positive phototropism and negative geotropism of the miracidia, which results in their accumulation at the top of the flask [14]. From those specimens that hatched, a total of 379 miracidia from 52 patients were collected. These included 81 miracidia from seven individuals in Sille-Elgo, 88 miracidia from 19 individuals in Wondo Genet, 151 from 20 individuals in Kemissie, and 59 from six individuals in Ziway. The miracidia were transferred individually in 2 μl of water into Eppendorf tubes using a micropipette under a dissecting microscope. Each miracidium was stored in its Eppendorf tube in 96 % ethanol at −20 °C until processed in the laboratory of the Centre de Biologie et d'Ecologie Tropicale et Méditerranéenne, University of Perpignan Via Domitia, France.

DNA extraction
DNA was extracted from Schistosoma mansoni miracidia following the protocol of Beltran et al. [15]. In brief, before DNA extraction, miracidia were individually vacuum-dried for 15 min in a Speedvac evaporator. Then, 20 μl of NaOH (250 mM) was added to each tube. After a 15 min incubation period at 25 °C, the tubes were heated in boiling water at 99 °C for 2 min. Then, 10 μl of HCl (250 mM), 5 μl of Tris-HCl (500 mM) and 5 μl of Triton X-100 (2 %) were added and a second heat shock in boiling water at 99 °C for 2 min was performed. The products were kept at room temperature until processed for Polymerase Chain Reaction (PCR).

Polymerase chain reaction
Eleven previously published polymorphic microsatellite markers, namely SMDA28, SMC1, SMDO11 and AF325698 [16], R95529, SMD57, L46951 and SMD25 [17], SMBR16 and SMBR10 [18] and SMS7 [19], were used to determine the genetic variation and population structure of the Ethiopian Schistosoma mansoni isolates. To maximize efficiency and minimize cost, the PCRs were performed in three multiplexes [20]. The loci were grouped as follows: Multiplex 1: R95529, SMC1, SMBR16, SMD57 and SMDO11; Multiplex 2: SMDA28, SMS7 and SMD28; Multiplex 3: SMBR10, L46951 and SMD25. The PCR reactions were carried out in a total volume of 20 μl containing 4 μl of 5X buffer (10 mM Tris-HCl, pH 9.0 at 25 °C, 50 mM KCl, 0.1 % Triton X-100), 0.2 μM of each oligonucleotide primer, 200 μM of each dNTP (Promega), 1 unit of GoTaq polymerase (Promega, Madison, Wisconsin), 1 μl of extracted DNA and DNase-free water to a final volume of 20 μl. The PCR program consisted of an initial denaturation phase at 95 °C for 5 min, followed by 40 cycles of 95 °C for 30 s, annealing at 57 °C for 20 s and 72 °C for 30 s, and a final extension at 72 °C for 10 min in a thermocycler (Bio-Rad, Hercules, USA). For each marker, the forward PCR primer was 5′ fluorescein-labeled (Proligo, Cambridge, UK), allowing precise analysis in an automated DNA sequencer. A mix of 40 μl sample loading solution (Beckman Coulter, Villepinte, France) and 0.1875 μl DNA size 400, a red-labeled size standard (CEQ™ DNA size standard kit, 400, Beckman Coulter), was prepared, and 0.75 μl of the microsatellite PCR products was diluted in 39.25 μl sample loading solution. A drop of mineral oil was added to each tube, and the samples were electrophoresed on an automatic sequencer (CEQ™ 8000, Beckman Coulter) with CEQ™ 8000 sequence analysis software. The sizes of the alleles were calculated using the fragment analyzer package [20]. All pairs of loci were tested for linkage disequilibrium based on 4400 permutations, with the significance threshold adjusted to P = 0.000227; no linkage disequilibrium was detected.
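The adjusted threshold of 0.000227 is consistent with a Bonferroni-style correction of the 5 % nominal level over every pair of loci in every village. The short sketch below reproduces that arithmetic under this assumption; the variable and function names are illustrative, and reading the 4400 permutations as a whole multiple of the number of tests is an interpretation rather than something stated in the text.

```python
from math import comb

N_LOCI = 11          # microsatellite loci genotyped
N_POPULATIONS = 4    # villages sampled
ALPHA = 0.05         # nominal significance level
N_PERMUTATIONS = 4400  # permutations reported in the text


def ld_test_count(n_loci, n_populations):
    """Number of pairwise linkage disequilibrium tests (assumed: one per locus pair per population)."""
    return comb(n_loci, 2) * n_populations


n_tests = ld_test_count(N_LOCI, N_POPULATIONS)   # 55 locus pairs x 4 villages
adjusted_alpha = ALPHA / n_tests                  # Bonferroni-adjusted threshold
permutations_per_test = N_PERMUTATIONS / n_tests

print(f"tests: {n_tests}")
print(f"adjusted threshold: {adjusted_alpha:.6f}")
print(f"permutations per test: {permutations_per_test:.0f}")
```

Under these assumptions the 55 locus pairs in four villages give 220 tests, and 0.05 divided by 220 rounds to the reported threshold.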

Data analysis
In this study, miracidia from all the patients within a single study site were treated as a population while all miracidia in a single host are treated as infrapopulation.
The allele frequencies were calculated using the program MICROSATELLITE TOOLKIT (software available upon request). Both the expected and observed heterozygosities were also calculated using MICROSATELLITE TOOLKIT and their statistical significance tested using the chi-square test at α = 0.05. FSTAT was used to test for deviations from Hardy-Weinberg equilibrium using exact tests, testing the hypothesis that observed diploid genotypes are the product of random union of gametes. An exact test for linkage disequilibrium between pairs of loci was also performed using FSTAT. Mean estimates of FIS (inbreeding coefficient) for each population and pairwise FST (between all population pairs) were calculated following the method of Weir and Cockerham [21]. Deviation of FIS and FST values from zero was tested using a permutation test. All F statistics were calculated using FSTAT 2.9.3.2 [22]. Isolation by distance (IBD) was tested by correlating genetic distance (FST/(1 − FST)) with geographic distance in kilometres. The paired t-test was used to compare two sample means where there is a one-to-one correspondence (pairing) between the samples, while Friedman's test was used for ordinal data or interval-scale variables that are not normally distributed [23]. Genetic structure was assessed using both Principal Component Analysis (PCA) in Genetix software [24] and a Bayesian approach in Structure software [25]. Full-sib analysis was performed by estimating the mean number of genetically unique adult worm pairs and its standard deviation for each patient using Colony software [26].
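The isolation-by-distance test described above can be sketched as follows. This is a minimal illustration, not the analysis pipeline used in the study: the site coordinates and pairwise FST values are hypothetical placeholders, the great-circle (haversine) distance is an assumed choice of geographic distance, and a plain Pearson correlation stands in for the permutation-based (Mantel-type) test that would normally be used to assess significance, because pairwise distances are not independent observations.

```python
from itertools import combinations
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def linearised_fst(fst):
    """Genetic distance FST / (1 - FST) used for isolation-by-distance tests."""
    if fst >= 1:
        raise ValueError("FST must be below 1")
    return fst / (1.0 - fst)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in decimal degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical sites (decimal degrees) and pairwise FST values, for illustration only.
sites = {"A": (10.7, 39.1), "B": (7.9, 38.7), "C": (5.5, 37.4)}
pairwise_fst = {("A", "B"): 0.05, ("A", "C"): 0.12, ("B", "C"): 0.08}

pairs = list(combinations(sorted(sites), 2))
geo = [haversine_km(*sites[a], *sites[b]) for a, b in pairs]
gen = [linearised_fst(pairwise_fst[(a, b)]) for a, b in pairs]

print(f"r = {pearson_r(geo, gen):.3f}")
```

With only four villages there are six pairwise comparisons, so any such correlation has little statistical power.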

Ethical consideration
The study was ethically approved by the Institutional Research and Ethics Committee of the Department of Microbial, Cellular and Molecular biology, Addis Ababa University, P.O. Box 1176, Addis Ababa, Ethiopia and by the National Research Ethics Review Committee of Federal Republic of Ethiopia Ministry of Science and Technology, P.O. Box 2490, Addis Ababa, Ethiopia. Informed verbal consent was obtained from all adults. For school-age children younger than 18, informed verbal consent was obtained from their parents through health extension workers and school principals. In addition, the children gave their assent. All study participants found positive for S. mansoni were treated with praziquantel at a dose of 40 mg/kg body weight.

Population level
Out of the 379 miracidia collected from 41 Schistosoma mansoni positive individuals (nine from Kemissie, 19 from Wondo Genet, seven from Sille-Elgo and six from Ziway) a total of 288 were successfully genotyped for 11 loci and analyzed at population level.
One hundred and sixty-four alleles were scored across the four populations for the 11 loci examined. No null alleles were detected. Individually, a total of 127, 123, 102 and 94 alleles across the 11 loci were counted for Kemissie, Wondo Genet, Sille-Elgo and Ziway, respectively. The number of alleles scored for each locus ranged from 4 to 22 (SMC1-SMDO11) with a mean value of 8.5 in Ziway; from 3 to 25 (SMD28-SMDO11) with a mean value of 11.5 in Kemissie; from 4 to 22 (SMC1-SMDO11) with a mean value of 11.2 in Wondo Genet; and from 3 to 23 (SMD28-SMDO11) with a mean value of 9.3 in Sille-Elgo. Across all four populations, the number of alleles per locus ranged from 8 to 34 (SMD28-SMDO11) with a mean value of 14.9 (Table 1). The nonparametric Friedman test showed a significant difference in the number of alleles counted among the populations (χ² = 10.941, 3 df; P = 0.012).
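The mean numbers of alleles per locus quoted above follow directly from the total allele counts divided by the 11 loci. A small sketch of that arithmetic, using only the totals given in the text (the dictionary and variable names are illustrative):

```python
N_LOCI = 11  # microsatellite loci genotyped

# Total alleles counted over all 11 loci, as reported in the text.
total_alleles = {
    "Kemissie": 127,
    "Wondo Genet": 123,
    "Sille-Elgo": 102,
    "Ziway": 94,
    "All populations": 164,
}

# Mean number of alleles per locus for each population.
mean_alleles = {name: total / N_LOCI for name, total in total_alleles.items()}

for name, value in mean_alleles.items():
    print(f"{name}: {value:.1f}")
```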
A paired t-test was used to evaluate the statistical significance of the deviations between the observed and expected heterozygosity (t = 10.367, 43 df, P < 0.001) (Table 3). The expected heterozygosity was 73 % in Sille-Elgo, followed by Wondo Genet (71 %), Kemissie (69 %) and Ziway (66 %). Similarly, the observed heterozygosity was 52 % for Sille-Elgo, 49 % for Wondo Genet, 44 % for Kemissie and 37 % for Ziway. In all study populations, FIS was strongly positive and different from zero at nearly all of the 11 loci; the only exceptions were negative values in Sille-Elgo at locus SMDA28 and in Ziway at locus SMBR10 (Table 4). The mean FIS value for the populations ranged from 0.27853 to 0.4347. A chi-square test showed a highly significant difference between the expected and observed heterozygosities (χ² = 32.818, 1 df, P < 0.001).
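As a rough cross-check, the heterozygote deficit can be approximated from the rounded village-level heterozygosities as FIS ≈ 1 − Ho/He. This is a simplified estimate, not the Weir and Cockerham estimator computed by FSTAT, so the values below are expected to be close to, but not identical with, the reported range of mean FIS; the function and variable names are illustrative.

```python
# (expected heterozygosity, observed heterozygosity) per village, from the text.
heterozygosity = {
    "Sille-Elgo": (0.73, 0.52),
    "Wondo Genet": (0.71, 0.49),
    "Kemissie": (0.69, 0.44),
    "Ziway": (0.66, 0.37),
}


def fis_from_heterozygosity(expected, observed):
    """Approximate inbreeding coefficient as 1 - Ho / He."""
    if expected <= 0:
        raise ValueError("expected heterozygosity must be positive")
    return 1.0 - observed / expected


approx_fis = {
    village: fis_from_heterozygosity(he, ho)
    for village, (he, ho) in heterozygosity.items()
}

for village, value in approx_fis.items():
    print(f"{village}: {value:.3f}")
```

The approximations run from about 0.29 (Sille-Elgo) to about 0.44 (Ziway), in line with a positive FIS in every village.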
The spatial genetic structure of the Schistosoma mansoni isolates in this study was determined by both Principal Component Analysis (PCA) using Genetix software [24] and a Bayesian approach using Structure software [25]. The first two axes of the PCA (46.9 and 34.9 % of the total variation for principal components 1 and 2, respectively) split the miracidia into four groups of points corresponding to the four villages (Fig. 1a). Only some miracidia from Kemissie fell within the Wondo Genet or Ziway groups of points. Each patient could be assigned to his or her own village (Fig. 1b). A similar result was obtained using the Bayesian approach, which showed the maximum probability for four clusters (Fig. 2). Cluster 1 represents Ziway, with 77 % of its own membership and about 23 % shared with the rest. Cluster 2 represents Kemissie, with 78 % of its own and 22 % shared with the rest. Cluster 3 represents Sille-Elgo, with 87 % of its own and 13 % shared with the rest. Cluster 4 represents Wondo Genet, with 68 % of its own and 32 % shared with the others.

Infrapopulation level
Twenty-four infrapopulations were analyzed for the number of alleles counted. The number of alleles scored in individual study subjects represented by five or more miracidia also varied (Table 7). The number of alleles at a single locus across the 24 infrapopulations ranged from 1 to 17. The largest mean number of alleles was 7.54 and the lowest was 3.09.
Allelic richness of Schistosoma mansoni isolates at the individual host level ranged from 1 to 1.96 for each locus (Table 8). Across the 24 infrapopulations, the allelic richness for each locus was in the range of 1.42-1.93 with a mean value of 1.76. The lowest and highest allelic richness were observed at the SMD28 and SMDO11 loci, respectively. There was no statistically significant variation in allelic richness of Schistosoma mansoni among the infrapopulations (χ² = 12.023, 23 df; P = 0.970).
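The text does not name the estimator used for allelic richness. Values bounded between 1 and 2 would be expected if richness were rarefied to a sample of two gene copies, so the sketch below assumes the usual rarefaction estimator, in which the expected number of alleles in a subsample of n gene copies is the sum over alleles of 1 − C(N − Ni, n)/C(N, n). The function name and the example allele counts are hypothetical.

```python
from math import comb


def rarefied_allelic_richness(allele_counts, n):
    """Expected number of distinct alleles in a random subsample of n gene copies.

    allele_counts: copies observed for each allele at one locus (N = their sum).
    n: rarefied sample size in gene copies (assumed here; not given in the text).
    """
    total = sum(allele_counts)
    if n < 1 or n > total:
        raise ValueError("n must be between 1 and the total number of gene copies")
    denominator = comb(total, n)
    # math.comb returns 0 when fewer than n copies remain, so common alleles count fully.
    return sum(1 - comb(total - c, n) / denominator for c in allele_counts if c > 0)


# Hypothetical locus: 10 gene copies (5 diploid miracidia) carrying three alleles.
example_counts = [6, 3, 1]
print(round(rarefied_allelic_richness(example_counts, 2), 3))
```

A monomorphic locus gives a richness of exactly 1 under this estimator, and rarefying to the full sample size returns the raw number of alleles.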
For the 24 infrapopulations, a total of 247 miracidia were analyzed at the 11 loci for their heterozygosity (Table 9). The expected heterozygosity (59-73 %) was higher than the observed heterozygosity (28-59 %). A paired t-test showed statistically significant deviations between the observed and expected heterozygosity (t = 18.091, 23 df, P < 0.001). At the infrapopulation level all FIS values were positive, and fourteen of the 24 values were significantly different from zero.
Colony analysis in Kemissie revealed an estimated total of 37 genetically unique adult worm pairs, representing 88 % of the pairs (Table 10). Twenty six pairs are shared among the eight patients sampled in Kemissie. Similarly, there were an estimated total of 36 genetically unique

Discussion
In the current study, a high level of S. mansoni genetic polymorphism was observed, as evidenced by the large number of alleles detected in each population and infrapopulation. Variation in the number of alleles counted from a single host at the 11 loci suggests that the degree of heterozygosity is highly variable among the study subjects. A Kenyan study reported a relatively higher value than the current study [5].
In a comparison of infrapopulation genetic diversity between two villages in Brazil, Thiele et al. [7] reported a much higher allele count than in the current study. The high genetic diversity of Ethiopian S. mansoni isolates can also be attributed to coinfections by multiple genotypes from genetically different cercariae. Schistosome genetic diversity within molluscan host populations has been characterized in previous studies [27]. These studies showed that the biology of schistosomes is such that dispersal of the parasite depends on host dispersal and on the dissemination of the free larval stages (miracidia and cercariae). The large geographical distance separating the four study sites is likely to limit contacts between populations, thus promoting schistosome population differentiation among them. Allelic richness was highest for Kemissie, followed by Wondo Genet, Ziway and Sille-Elgo. The nonparametric Friedman test showed no significant difference in allelic richness among the four populations. However, the Wilcoxon rank test indicated that Sille-Elgo had the lowest allelic richness, showing low genetic variation of the S. mansoni isolates of Sille-Elgo compared to the others. A similar finding was reported in a Kenyan study [5]. It is interesting to note that the gradient in genetic diversity (highest for Kemissie, followed by Wondo Genet and Sille-Elgo) is the same as the gradient we previously observed for prevalence and intensity of infection [1]. However, because only three populations were sampled, we cannot statistically validate a link between parasite genetic diversity and parasite virulence.
In the current study there was a statistically significant difference between the expected and observed heterozygosities, showing significant departure from Hardy-Weinberg equilibrium. The FIS value for all loci in the four populations is significantly greater than zero, indicating heterozygote deficiency.
However, Sille-Elgo and Ziway had values below zero at the SMDA28 and SMBR10 loci, respectively, indicating heterozygote excess. A study in Brazil reported a result broadly similar to ours [28]. A study by Agola et al. [5] showed no statistically significant difference between expected and observed heterozygosity. In a recent study performed in a Senegalese population, similar positive FIS values were measured at both the population (i.e. village) and infrapopulation (i.e. human) levels [6]. The authors propose that positive FIS values at the infrapopulation level may be a consequence of sib transmission (i.e. a person visiting the same transmission site frequently). This sib transmission is particularly true for children compared to adults [6]. Because our sampling includes children younger than 14 years old, a similar sib-transmission explanation agrees with the FIS values we observed. In a context of strong sib transmission, a significant genetic structure would be expected, as we observed in our study. In the study by Van den Broeck et al. [6], the authors did not observe spatial structure and proposed that this is a consequence of high host mobility. We could thus hypothesize that host mobility is reduced in the present study compared to the previous study in Senegal.
Fig. 1 a Principal components analysis at the population level. The first two principal components (PCs) are shown. Each miracidium is represented by one dot, with the color label corresponding to its self-identified population of origin (Kemissie, Sille-Elgo, Wondo Genet, Ziway). The percentage of the variation in genetic distances explained by each PC is 46.9 and 34.9 % for PC1 and PC2, respectively. b PCA by patient (each point represents one patient). The percentage of the variation in genetic distances explained by each PC is 16.9 and 11.5 % for PC1 and PC2, respectively
All the methods we used in this study (FST calculation, PCA and the Bayesian approach) show a clear and significant genetic differentiation, supporting distinctions among the four populations and thus implying that there is restricted gene flow among the schistosomes under study [29]. In contrast to this finding, a study in central Kenya showed that the PCA lacked clear geographical patterns, suggesting the absence of strong substructure within the S. mansoni population [5]. However, in the current study the genetic structure of the S. mansoni population was not associated with isolation by distance. Studies from Guadeloupe [30], Brazil [7] and Kenya [5,20] have also shown that the genetic structure of S. mansoni populations was not associated with isolation by distance, which is in agreement with our finding. As described by Agola et al. [5], the probable reason for the absence of association between genetic diversity and isolation by distance in this study could be a combination of factors that include restricted gene flow between populations since the areas are located in geographically distant sites, sib-transmission, local adaptations and systematic variations in environmental conditions. Another explanation may be that the low number of studied sites does not allow a statistically significant link between genetic and geographic distances to be shown. The estimation of genetically unique adult worm pairs within a single host demonstrates the occurrence of infections with genetically different S. mansoni strains within a single host. An investigation of the genotypic composition of S. mansoni for its adult stages within the definitive host (the wild rat, Rattus rattus) and for the larval stages within the intermediate host (the snail, Biomphalaria glabrata), both collected at the same transmission site, was conducted by Theron and colleagues [31].
The results showed that intramolluscan larval infrapopulations were characterized by a low infection rate (0.6 % on average) and low intra-host genetic diversity (1.1 genotypes on average per infected snail), while adult infrapopulations within rats showed a high infection rate (94 %) and a substantial intra-host genetic diversity (34 genotypes on average) linked to high intensities (160 worms per host on average). Analysis of the genetic data allowed them to identify various ecological, behavioral and immunological factors which are likely to enhance transmission of multiple parasite genotypes towards the vertebrate hosts. This identification of infection of both the intermediate and definitive hosts with genetically different S. mansoni strains supports the current finding that multiplicity of infection occurred in human hosts based on sibship determination. Colony analysis in Kemissie revealed an estimated total of 37 genetically unique adult worm pairs, with a mean of 88 %, and 26 worm pairs shared among eight individuals. Similarly, there were an estimated total of 36 genetically unique adult worm pairs, with a mean of 73 %, and 19 worm pairs shared among the four individuals in Sille-Elgo. In Wondo Genet there were an estimated total of 32 genetically unique adult worm pairs, with a mean of 92 %, and 15 worm pairs shared among the nine individuals. In Ziway, a total of 24 genetically unique adult worm pairs with a mean of 66 % were observed, and 4 worm pairs were shared among the three individuals. These results show high levels of infection of single hosts with multiple genetically different strains of S. mansoni.