Microsatellite-based phylogeny of Indian domestic goats
© Rout et al. 2008
Received: 25 July 2007
Accepted: 28 January 2008
Published: 28 January 2008
The domestic goat is one of the important livestock species of India. In the present study we assess genetic diversity of Indian goats using 17 microsatellite markers. Breeds were sampled from their natural habitat, covering different agroclimatic zones.
The mean number of alleles per locus (NA) ranged from 8.1 in Barbari to 9.7 in Jakhrana goats. The mean expected heterozygosity (He) ranged from 0.739 in Barbari to 0.783 in Jakhrana goats. Deviations from Hardy-Weinberg Equilibrium (HWE) were statistically significant (P < 0.05) for 5 locus-breed combinations. The DA measure of genetic distance between pairs of breeds indicated that the lowest distance was between Marwari and Sirohi (0.135). The highest distance was between Pashmina and Black Bengal. An analysis of molecular variance indicated that 6.59% of the variance exists among the Indian goat breeds. Both a phylogenetic tree and Principal Component Analysis showed the distribution of breeds in two major clusters with respect to their geographic distribution.
Our study concludes that Indian goat populations can be classified into distinct genetic groups or breeds based on the microsatellites as well as mtDNA information.
Table 1. Summary of goat breeds sampled
Mathura, Agra (UP): medium size, known for both milk and meat
Chakarnagar, Etawah (UP): large size, known for milk production
Small size breed known for meat and skin quality
Best fibre-producing breed
Jhakarana, Behror, Alwar (Rajasthan): large size, known for milk production
Desh-Nokh, Bikaneri (Rajasthan): large size breed known for meat, milk and coarse fibre
Tonk, Udaipur (Rajasthan): large size, known for meat and milk
An assessment of genetic variability in domestic goats is a first step towards conserving genetic resources and maintaining breeding options. As agricultural practices change, a few breeds have been used on a large scale for immediate economic gain. As a result, locally adapted native breeds have been neglected or displaced without their genetic importance being known. DNA markers have been used to study genetic variation in livestock, human and other populations [3–5]. Genetic markers are used to determine genetic variation between breeds; relationships among breeds are then determined by calculating genetic distances and constructing trees. Microsatellite markers have proved to be good tools for analysing genetic variation in cattle, sheep, pigs and goats [6–11]. Using maternally inherited mitochondrial DNA (mtDNA) sequence information, we have earlier demonstrated the existence of all three previously known lineages (A, B and C) and proposed two new lineages among Indian goat populations. However, no study of Indian goats has used nuclear microsatellite/short tandem repeat (STR) loci and covered a wide set of populations from different agroclimatic zones. Therefore, we have characterized seven economically important Indian goat breeds using STR markers.
Table 2. STR markers, their localization, allele range and annealing temperature (columns include number of alleles identified)
Annealing conditions in table order: 58° for 30 sec; 55° for 15 sec; 54° for 20 sec; 60° for 30 sec; 61° for 15 sec; 58° for 45 sec; 58° for 20 sec; Oar HH 56; 63° for 15 sec; 55° for 30 sec; 55° for 30 sec; 58° for 45 sec; 57° for 30 sec; 50° for 30 sec; 55° for 15 sec; 55° for 15 sec; 60° for 30 sec; 55° for 15 sec
Table 3. Measures of genetic variability in Indian goats (columns include total number of alleles, TNA)
The mean number of alleles per locus (NA) varied from 8.1 in Barbari to 9.7 in Jakhrana goats. The mean number of alleles per locus corrected for sample size (calculated based on n = 31) is presented in Table 3. The two estimates differed in some breeds because of variation in sample size. The most diverse goat breeds were the Jakhrana and Sirohi, which had the highest total number of alleles (TNA) of 165 and 162 and the highest mean number of alleles (MNA) of 9.7 and 9.3, respectively. The least diverse breed was the Pashmina, which had the lowest TNA of 129 and the lowest MNA of 7.6. Average and expected heterozygosity were also lowest in the Pashmina. Similarly, the Jakhrana and Sirohi had the highest expected heterozygosity of 0.783 and 0.782, respectively. However, the Marwari had higher observed heterozygosity than the Jakhrana and Sirohi. Deviations from Hardy-Weinberg Equilibrium (HWE) were statistically significant (P < 0.05) for 5 locus-breed combinations: one each in Barbari (ILSTS 005), Jamunapari (ILSTS 005), Black Bengal (SRCRSP10), Pashmina (Oar HH56) and Marwari (ETH 225). However, the total number of significant deviations was below the 5% level in each population.
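The within-breed measures reported here (number of alleles, observed and expected heterozygosity) can be illustrated with a minimal sketch for a single locus. It assumes the common unbiased gene diversity estimator; the function names and the genotype data are hypothetical and are not taken from the study or from the software it used.

```python
from collections import Counter

def allele_counts(genotypes):
    """Count allele copies at one locus from diploid genotypes (pairs of allele sizes)."""
    return Counter(allele for genotype in genotypes for allele in genotype)

def expected_heterozygosity(genotypes):
    """Unbiased gene diversity: (2n / (2n - 1)) * (1 - sum of squared allele frequencies)."""
    counts = allele_counts(genotypes)
    copies = sum(counts.values())  # 2n gene copies for n diploid animals
    homozygosity = sum((c / copies) ** 2 for c in counts.values())
    return copies / (copies - 1) * (1.0 - homozygosity)

def observed_heterozygosity(genotypes):
    """Proportion of animals carrying two different alleles at the locus."""
    return sum(a != b for a, b in genotypes) / len(genotypes)

# Hypothetical genotypes (allele sizes in bp) for four animals at one locus.
example_locus = [(120, 124), (120, 120), (124, 128), (120, 128)]
number_of_alleles = len(allele_counts(example_locus))
he = expected_heterozygosity(example_locus)
ho = observed_heterozygosity(example_locus)
```

Averaging these per-locus values over all loci would give breed-level figures comparable in kind to NA, Ho and He.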
FST values for each pair of populations varied from 0.036 to 0.088. The average GST value over all loci was 0.080, indicating that 8.0% of the total genetic variation corresponded to differences among populations, whereas 92.0% was explained by differences among individuals. The average RST value (based on the stepwise mutation model) over the loci was 0.177. Mean pairwise comparisons between breeds showed that RST values were 2- to 4-fold higher than GST values. An exact test for population differentiation for all pairs of breeds across all loci showed that all breeds were significantly (P < 0.001) different from each other.
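As a rough illustration of how a GST value partitions diversity, the sketch below uses Nei's definition GST = (HT - HS) / HT, where HS is the mean within-population gene diversity and HT is the gene diversity of the pooled (mean) allele frequencies. It is a simplified version without sample-size corrections; the data layout, function names and frequencies are assumptions for illustration only.

```python
def gene_diversity(freqs):
    """1 - sum of squared allele frequencies for one population at one locus."""
    return 1.0 - sum(p * p for p in freqs.values())

def gst(loci):
    """Multi-locus GST.

    loci: list of loci; each locus is a list of per-population
    dicts mapping allele -> frequency.
    """
    hs_sum = 0.0
    ht_sum = 0.0
    for populations in loci:
        k = len(populations)
        # HS: average within-population diversity at this locus.
        hs_sum += sum(gene_diversity(f) for f in populations) / k
        # HT: diversity of the unweighted mean allele frequencies.
        alleles = set().union(*populations)
        mean_freqs = {a: sum(f.get(a, 0.0) for f in populations) / k for a in alleles}
        ht_sum += gene_diversity(mean_freqs)
    return (ht_sum - hs_sum) / ht_sum

# Hypothetical frequencies: one locus, two populations.
example = [[{"A": 0.8, "B": 0.2}, {"A": 0.4, "B": 0.6}]]
example_gst = gst(example)  # HS = 0.40, HT = 0.48
```

A value of 0.080, as reported above, would mean that 8% of the total gene diversity lies between populations under this definition.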
AMOVA analysis of Indian goat breeds based on microsatellite DNA variation (table columns: source of variation, degrees of freedom, sum of squares, percentage of variation)
Nei's DA genetic distance matrix and Pairwise Fst distance between seven Indian goat breeds (Fst above diagonal and DA distance below diagonal)
On the other hand, BM4621 and ILSTS005 participated in the construction of only one or two axes. ILSTS005 participated in the construction of the first axis. The corresponding PCA plot indicates that the Pashmina and Barbari breeds were isolated from the other populations, as on the first axis of the global analysis, but the clusters Jakhrana, Jamunapari, Black Bengal and Sirohi, Marwari were not exhibited by this marker. In contrast, BM4621, which contributes to the construction of the second and third axes, exhibited these clusters, but did not isolate the Barbari and Pashmina breeds. BM4621 revealed three clusters (Pashmina, Barbari, other breeds) in contrast to the four clusters exhibited by the global analysis (Pashmina, Barbari, Black Bengal and Jamunapari, Jakhrana).
Our genetic analysis of seven Indian goat breeds with 17 microsatellite markers showed higher gene diversity than has been reported for European and Asian goat breeds. Barker (1994) and Takezaki and Nei (1996) suggested that microsatellite loci for genetic diversity studies should have more than four alleles in order to reduce the standard error of genetic distance estimates. The total number of alleles per locus in the present study ranged from 9 to 22. This high number of alleles at each locus suggested that all the markers used were appropriate for analysing diversity in Indian goats. The mean number of alleles observed over a range of loci in different populations is considered a reasonable indicator of genetic variation within populations. A more appropriate measure of genetic variation within a population is gene diversity (average expected heterozygosity). Gene diversity for each breed ranged from 0.724 in Barbari to 0.783 in Jakhrana. Takezaki and Nei (1996) determined that, for markers to be useful for measuring genetic variation, they should have an average heterozygosity ranging from 0.3 to 0.8 in the populations. This again confirmed that these markers were appropriate for measuring genetic variation. In our previous analysis of the mitochondrial HVR1 region, the lowest haplotype diversity was observed in Pashmina goats (0.926) and the highest in Jamunapari goats. Microsatellite analysis also revealed that the Pashmina goats exhibited the lowest diversity of the breeds in the present study. The measures of population differentiation indicated variability within breeds, and the exact test for population differentiation indicated significant differences between breeds. We observed a large, significant difference between expected and observed heterozygosities in all the Indian goat breeds. This large difference indicates that there is a considerable degree of genetic subdivision within breeds.
In India, goats are exploited very little by artificial selection, but mtDNA analysis has established new lineages in Indian goats as compared to other goat breeds of the world.
As no systematic effort has been made to create distinct goat breeds in India, founder effects and genetic drift may have played a major role in the differentiation of Indian goat breeds. Population subdivision in Indian goats is also supported by the average proportion of genetic differentiation among breeds (8.03%). RST measures the fraction of the total variance in allele size that lies between populations. The estimated RST was more than twice the size of GST and FST, suggesting that goat breeds differ in both allele frequency and allele size. In addition, AMOVA indicated that 6.59% of the total genetic variation is between breeds of goats, confirming higher within-population diversity in the Indian subcontinent. Mitochondrial DNA analysis in Indian goats showed 83% of variation within breeds and 17% among breeds. The between-breed variation in Swiss goats was 17% using microsatellites. Similarly, mtDNA analysis showed about 10.7% variation among the goat breeds from Africa, the Middle East, Asia and Europe.
Takezaki and Nei (1996) have shown that the construction of a phylogenetic tree depends on the type of population and the number of markers used to analyse the population. It has also been shown that increasing the number of loci does not necessarily enhance the reliability of the phylogeny. Takezaki and Nei (1996) have demonstrated that DA and DC are the most efficient means of obtaining a correct tree topology on the basis of microsatellite analysis when within-population variation is high and distances between each pair of populations are used to build an NJ tree.
The genetic data revealed that the smallest distance is between Marwari and Sirohi and the largest distances are between Pashmina and Black Bengal, Barbari and Black Bengal, and Barbari and Jakhrana. The highest geographical distance, between Black Bengal and Pashmina, corresponds to the highest genetic distance. The phylogenetic analysis indicated that breeds were grouped according to their geographic locations, except for Barbari goats. A similar observation of population clustering according to geographic origin has been reported in cattle. This shows that geographically adjacent populations are more closely related genetically. The principal component analysis supported the grouping of animals, and the distance between breeds was significant. However, Barbari goats showed a deviation, as they did not cluster with geographically close breeds such as Sirohi, Jamunapari and Jakhrana. The Barbari did not group with any of the neighbouring breeds, consistent with variation in morphological characteristics (ear pattern) and the presence of two new lineages in the mtDNA analysis. Moreover, Jamunapari, Jakhrana and Black Bengal clustered in one group, indicating a shared gene pool and motivating further analysis to establish their migration and origin through a coastal route.
Breeds cluster according to their geographic location. Similar population clustering according to geographic location was previously observed in microsatellite analyses of humans, cattle and chickens. Mitochondrial DNA analysis in goats also indicated geographical clustering in the breeds. The result indicated that geographically adjacent populations were more closely related genetically, perhaps due to founder effects and interbreeding near bordering areas. PCA separated populations from different geographical locations more clearly. Some breeds that showed a close phylogenetic relationship are separated more clearly by the PCA plot. mtDNA markers are extremely informative for predicting the conformation of the gene pool from maternal inheritance. The Jamunapari is a breed of semi-arid regions and is isolated in a small pocket of the Chambal ravines. The Barbari is a goat breed of semi-arid regions and is distributed over a wide breeding tract. Pashmina goats are found in high-altitude Himalayan regions and are known for fibre quality. Similarly, the Black Bengal is the most prolific breed from the eastern part of India and is distributed over a wide area. The Jakhrana is also from semi-arid regions of Rajasthan and is adapted to a specific locality. The Marwari and Sirohi are breeds of the arid regions of western India and are distributed over a large breeding tract. The differential existence of breeds varying from hot humid to hot arid, hot humid to cold humid and isolated to particular regions is illustrated by breeds such as the Jamunapari and Jakhrana. PCA isolated the Barbari from other breeds, indicating their differential origin, consistent with ear characteristics and the existence of two new lineages in the mtDNA analysis. Principal component analysis showed clustering of goats according to their geographical origin.
Although the breeding tracts of goats are overlapping and no strict breeding policy is adopted to maintain standard breeds in the region, they still maintain genetic distinctness in their natural habitat. Therefore it is necessary to combine genetic data with geographical positioning and to assess the genetic relationship by geostatistical models in further studies.
In conclusion, this analysis showed that microsatellites as well as mtDNA analysis can be used to classify Indian goat populations into distinct genetic groups or breeds. Phylogenetic and principal component analysis showed the clustering of goats according to their geographical origin. Although the breeding tracts of goats are overlapping and they are spread over all the parts of the country, they still maintain genetic distinctness in their natural habitat.
A total of 302 goats representing 7 major breeds of India were sampled from their natural habitat. The breeds studied have been grouped into 5 different types based on their utility and size (Table 1). Summaries of goat breeds sampled from their natural habitat are described in Table 1.
About 10 ml of blood was collected from each animal's jugular vein using EDTA vacutainers and stored at 4°C until DNA isolation. An effort was made to collect samples from unrelated individuals based on information provided by farmers. The geographical distribution of the breeds from different regions sampled for the study is described in Table 1. DNA was isolated from blood as described elsewhere.
The markers were chosen from the existing bovine, ovine and caprine genetic maps [20–22] with an effort to cover all chromosomes and to select markers with high heterozygosity. Twenty-three STR markers were included in this study for analysing the variation among goat breeds (Table 2), and all DNA samples were analysed with these 23 markers. Each 10 μl PCR reaction mixture consisted of 10 ng of template DNA, 1× buffer, 200 μM dNTPs, 2.5 mM MgCl2, 1 U of AmpliTaq Gold (Perkin Elmer) and 10 pM of each primer. Amplification conditions for these markers were as follows: initial denaturation at 95°C for 10 min, followed by 30 cycles of 95°C for 1 min, the specific annealing temperature for each marker as given in Table 2, and 72°C for 30 s. A final extension at 72°C for 5 min was used for each reaction.
One microlitre of PCR amplicons and 0.5 μl of size standard (GS-ROX500) were mixed with 2.5 μl of loading dye (formamide: blue dextran; 5:1), denatured (94°C for 2 min) and electrophoresed in a 5% Long Ranger (FMC) gel, using an ABI 377 automated DNA sequencer (Perkin Elmer). GeneScan (Perkin-Elmer) software was used to analyse the gel image and Genotyping software (Perkin-Elmer) was used to obtain the allele size of each amplicon.
Exact tests for deviations from Hardy-Weinberg equilibrium (HWE) were performed with the GENEPOP package. The program performs a probability test using a Markov chain (dememorization 10,000, batches 100, iterations per batch 1000). Genetic disequilibrium was estimated between all locus pairs using GENEPOP. The mean number of alleles per locus (NA), total number of alleles (TNA), the observed heterozygosity (Ho) and the expected heterozygosity (He) under HWE were computed using FSTAT and AGArst software. To test for sample bias, allelic richness based on the minimum sample size (n = 31) was estimated using FSTAT. Significance levels of differences in NA and He between populations were tested using Wilcoxon's signed-rank test.
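Correcting the number of alleles for sample size is commonly done by rarefaction. The sketch below assumes the standard rarefaction formula, in which the expected number of alleles in a subsample of n gene copies drawn from N observed copies is the sum over alleles of 1 - C(N - Ni, n) / C(N, n); whether this matches the software's internal implementation exactly is an assumption. The function name and the allele counts are hypothetical.

```python
from math import comb

def allelic_richness(allele_counts, subsample_copies):
    """Expected number of distinct alleles in a random subsample of gene copies.

    allele_counts: dict mapping allele -> number of copies observed (Ni).
    subsample_copies: number of gene copies to rarefy to
        (for diploids, 2 x the smallest number of animals typed).
    """
    total = sum(allele_counts.values())
    if subsample_copies > total:
        raise ValueError("cannot rarefy to more copies than were sampled")
    all_draws = comb(total, subsample_copies)
    richness = 0.0
    for copies in allele_counts.values():
        # Probability that this allele is absent from the subsample;
        # comb returns 0 when fewer than subsample_copies other copies exist.
        p_absent = comb(total - copies, subsample_copies) / all_draws
        richness += 1.0 - p_absent
    return richness

# Hypothetical allele counts at one locus (8 gene copies, 3 alleles).
counts = {120: 4, 124: 2, 128: 2}
richness_at_4 = allelic_richness(counts, 4)
```

For a minimum sample of 31 animals, the subsample size under this assumption would be 62 gene copies per locus.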
Analysis of molecular variance (AMOVA), FST and pairwise differences were computed using ARLEQUIN ver 3.11 [26, 27]. A second estimator of gene differentiation, RST, which accounts for variance in allele size and is defined for genetic markers evolving under a stepwise mutation model, was also calculated. The DA genetic distance was computed with DISPAN to establish the genetic distance between populations. NJ and UPGMA trees of the Indian goat breeds were constructed for comparison using the PHYLIP package ver 3.6. The significance of population differences was tested using the exact test of population differentiation implemented in GENEPOP, which is based on allele frequency variation.
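For orientation, the DA distance between two populations X and Y is defined over r loci as DA = 1 - (1/r) Σ_j Σ_i sqrt(x_ij · y_ij), where x_ij and y_ij are the frequencies of allele i at locus j. A minimal sketch of that formula follows; the data layout, function name and frequencies are hypothetical and do not reproduce the DISPAN program.

```python
from math import sqrt

def da_distance(pop_x, pop_y):
    """Nei's DA distance between two populations.

    pop_x, pop_y: lists with one dict per locus (same locus order),
    each mapping allele -> frequency in that population.
    """
    if len(pop_x) != len(pop_y):
        raise ValueError("both populations must be typed at the same loci")
    shared = 0.0
    for freqs_x, freqs_y in zip(pop_x, pop_y):
        # Alleles missing from one population contribute sqrt(0) = 0.
        for allele, x in freqs_x.items():
            shared += sqrt(x * freqs_y.get(allele, 0.0))
    return 1.0 - shared / len(pop_x)

# Hypothetical allele frequencies at two loci.
breed_1 = [{"A": 0.5, "B": 0.5}, {"C": 1.0}]
breed_2 = [{"A": 0.5, "B": 0.5}, {"D": 1.0}]
example_da = da_distance(breed_1, breed_2)  # identical at locus 1, no shared allele at locus 2
```

A pairwise matrix of such distances is the kind of input a neighbour-joining or UPGMA tree is built from.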
As admixture in the breeding tract of goats is very common and some populations are known to be admixed, we used a multivariate procedure to represent population relationships. Multivariate procedures are recommended in this situation because admixed populations are not original evolutionary units and may be misrepresented by phylogeny-based tree-building techniques. Prior to multivariate analysis, we tested for congruence of loci following the two-step procedure developed by Moazami-Goudarzi and Laloe (2002). This is because a consensus representation of population relationships is not meaningful if single markers are not congruent, as would occur if many of the distances among populations based on individual loci were negatively correlated. Euclidean distance matrices between all populations were first generated for each locus. Then a Kruskal-Wallis test on rank scores of standardized distances between populations was carried out; a significant Kruskal-Wallis test indicates that a compromise structure exists because distances are unequal between populations. If a compromise structure exists, then a multivariate analysis such as principal components analysis will be meaningful. PCA was then done using the allelic frequencies as variables. It leads to a representation of populations as a cloud of points in a metric space. Comparing the inertia of single markers makes it possible to compare the typological value of the markers. Inertia can be split up according to axes and/or loci. All computations were done using the R software. More specifically, computations relative to PCA were done with the ade4 package.
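The idea of splitting inertia by locus can be sketched in a few lines. The example below assumes a centred, unscaled PCA with uniform population weights, so that total inertia is the sum over alleles of the variance of allele frequencies across populations, and each locus contributes the part carried by its own alleles. It is written in Python rather than R, does not reproduce the ade4 package, and all names and frequencies are hypothetical.

```python
def locus_inertia(populations):
    """Inertia carried by one locus.

    populations: list of per-population dicts mapping allele -> frequency.
    Returns the sum over alleles of the variance (denominator = number of
    populations) of that allele's frequency across populations.
    """
    k = len(populations)
    alleles = set().union(*populations)
    inertia = 0.0
    for allele in alleles:
        values = [freqs.get(allele, 0.0) for freqs in populations]
        mean = sum(values) / k
        inertia += sum((v - mean) ** 2 for v in values) / k
    return inertia

def inertia_shares(loci):
    """Fraction of the total inertia contributed by each locus."""
    per_locus = {name: locus_inertia(pops) for name, pops in loci.items()}
    total = sum(per_locus.values())
    return {name: value / total for name, value in per_locus.items()}

# Hypothetical allele frequencies for three populations at two loci.
example_loci = {
    "L1": [{"A": 0.2, "B": 0.8}, {"A": 0.5, "B": 0.5}, {"A": 0.8, "B": 0.2}],
    "L2": [{"C": 0.4, "D": 0.6}, {"C": 0.5, "D": 0.5}, {"C": 0.6, "D": 0.4}],
}
shares = inertia_shares(example_loci)
```

In this toy case locus L1 separates the populations far more than L2, which is the sense in which single-marker inertia reflects a marker's typological value.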
PKR and KT gratefully acknowledge the financial support of Department of Biotechnology (Animal Biotechnology Task Force), Government of India, New Delhi. We thank Alkes Price for editing the manuscript.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.