The present study reports analyses of DNA sequence variability at two trait-specific genes in indigenous rice varieties from the Eastern Himalayan region of NE India. The Wx gene is associated with amylose synthesis, which determines the glutinous or nonglutinous nature of rice grains. The OsC1 gene is involved in the synthesis of anthocyanin and is associated with coloration of the apiculus in rice grains. The rice varieties used in this study include glutinous and nonglutinous types as well as colored and colorless apiculus types, collected from a broad geographic area covering most of NE India.
The present study revealed that previously identified mutations do not exclusively account for the corresponding phenotypes in rice varieties. For example, the glutinous nature of most rice varieties is considered to result from a G to T mutation at the 5′ splice donor site of exon 2 of the Wx gene [18, 22]. In the present study, three of the five glutinous rice varieties carried the G to T mutation in the Wx gene, whereas the other two did not. Conversely, one of the 25 nonglutinous rice varieties carried the G to T mutation while maintaining the nonglutinous phenotype. This finding suggests that genes or genomic regions other than those previously reported are associated with the glutinous and nonglutinous phenotypes of cultivated rice. Similarly, several reports indicated a correlation between variation in amylose content and the number of repeats in the microsatellite region within the Wx gene [37, 38]. Although the present study also found a highly variable microsatellite locus within the Wx gene, there was no direct correlation between the number of repeats and the glutinous nature of rice grains.
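The mismatch between the G to T genotype and the grain phenotype can be expressed as a simple discordance fraction. The following is a minimal sketch using only the Wx counts stated above; the function name and argument names are illustrative and are not part of the study's analysis pipeline.

```python
def discordant_fraction(carrier_trait, carrier_no_trait,
                        noncarrier_trait, noncarrier_no_trait):
    """Fraction of varieties whose phenotype disagrees with the phenotype
    predicted from the mutation.

    "carrier" means the variety has the mutation; "trait" means it shows
    the phenotype the mutation is expected to cause.
    """
    total = (carrier_trait + carrier_no_trait
             + noncarrier_trait + noncarrier_no_trait)
    if total == 0:
        raise ValueError("no varieties supplied")
    # Discordant: mutation without the trait, or trait without the mutation.
    discordant = carrier_no_trait + noncarrier_trait
    return discordant / total


# Wx counts as stated in the text: 3 glutinous carriers, 1 nonglutinous
# carrier, 2 glutinous non-carriers, and the remaining 24 of the 25
# nonglutinous varieties as non-carriers.
wx = discordant_fraction(3, 1, 2, 24)
print(f"Wx discordance: {wx:.1%}")  # 3 discordant varieties out of 30
```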
Analyses of the OsC1 locus revealed similar patterns. The colorless apiculus in rice varieties is often attributed to a 10 bp deletion in the OsC1 gene. Although 17 of 21 varieties with colorless apiculus included in the present study had the 10 bp deletion in the OsC1 gene, five varieties without the 10 bp deletion also showed the colorless phenotype. Similarly, eight varieties without the 10 bp deletion showed the colored apiculus phenotype as expected, whereas one variety with the 10 bp deletion also showed the colored apiculus phenotype. Thus, the apiculus color phenotype of 18% of the indigenous rice varieties in NE India did not correspond to the reported apiculus color determining genotype of the OsC1 gene.
One of the varieties with the colorless apiculus phenotype (Mimutim) had the 10 bp deletion in the R3 region and showed a G to C nucleotide change resulting in a substitution from Lysine to Aspartic acid, possibly contributing to the observed colorless phenotype. Another colorless apiculus variety (Bashful) without the 10 bp deletion showed an amino acid change from Proline to Arginine in exon 1, suggesting that this mutation could be associated with the coloration of the apiculus. However, the other three colorless apiculus varieties (Borua Beroin, Lahi and Borjahinga), which lack the 10 bp deletion in exon 3, did not carry the Proline to Arginine change, suggesting that other genomic regions also play a role in determining apiculus color. The mutation at position 845 of exon 3, which substitutes Alanine with Valine in three varieties (Tilbora, Kawanglawang and Balam) and in O. rufipogon, showed no effect on apiculus color, suggesting that the substitution of an amino acid with similar hydrophobicity at this position does not affect the phenotype. Overall, these observations suggest that multiple genomic regions are involved in determining a particular phenotype. There are several examples of multiple genes or interacting loci determining a phenotype [24, 39, 40]. Two of the SNPs, the C to G mutation at position 122 in exon 1 and the G to T mutation at position 845, have already been identified in a previous study. The G to C mutation at position 60 in exon 1 is reported for the first time in this study.
It is generally considered that the domestication process reduces nucleotide diversity at domestication related genes that control the specific traits selected during domestication. In other words, genes that regulate a trait under positive selection during domestication and improvement may carry 'signatures of selection' in the form of typical patterns of reduced nucleotide diversity. This is evidenced by much lower levels of nucleotide diversity at the Wx gene among glutinous rice varieties than among nonglutinous rice varieties [24, 41]. Similar observations have been reported for the nonshattering sh4 allele, which shows reduced nucleotide sequence polymorphism in cultivated rice varieties as compared to wild progenitors, and for the ramosa1 gene, which controls branching architecture in the tassel and ear and shows reduced diversity in cultivated maize as compared to the wild teosintes. However, the present study revealed higher levels of nucleotide diversity at the Wx locus in the glutinous varieties (πtot = 0.0053) than in the nonglutinous varieties (πtot = 0.0043). This could be because the Wx gene, which has been associated with the glutinous nature of rice, may not be the sole gene that determines the glutinous phenotype; the phenotype is likely controlled by multiple loci. This interpretation is further supported by the fact that the Wx intron 1 splice donor site mutation (G to T) is also found in some nonglutinous rice varieties, indicating that this mutation is not necessarily responsible for the expression of the glutinous phenotype [5, 44]. These findings agree with other studies, which showed that interaction with other genes (e.g. dull genes) may modify the phenotype of the Wx gene or other dull genes. Teng et al. suggested that allelic variation at the Wx gene may not necessarily regulate starch properties in different rice varieties. The linkage association study also showed an interplay of multiple genes in determining starch physicochemical properties in rice.
Although selective sweeps may drastically reduce nucleotide diversity in target genes such as the Wx locus, diversifying selection due to environmental heterogeneity and local cultural preferences favoring other traits may increase nucleotide diversity. The diverse agroclimatic conditions and the varied cultural practices of indigenous communities may have played a significant role in maintaining high levels of diversity in the glutinous rice varieties of NE India.
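For readers less familiar with the statistic, nucleotide diversity (π) is the average number of pairwise nucleotide differences per site among the sampled sequences. The sketch below illustrates the calculation on a hypothetical four-sequence alignment; it assumes pre-aligned sequences of equal length without gaps and is not the software used to obtain the πtot values reported here.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site for aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(seqs, 2))
    # Count mismatched positions over every pair of sequences.
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


# Hypothetical toy alignment (not data from this study).
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC", "ACGAACGTAC"]
print(nucleotide_diversity(toy))  # 6 differences over 6 pairs x 10 sites
```

Comparing this value between phenotype groups, as done for the glutinous and nonglutinous varieties, only requires calling the function on each group's alignment separately.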
In the present study, positive Tajima's D values were detected for the glutinous and nonglutinous varieties (Table 4), except for small regions of the Wx gene that showed negative values among nonglutinous varieties (Figure 4). Since the values of Tajima's D were not significantly different from zero, the overall distribution of nucleotide diversity falls within neutral expectations (Table 4). Since demographic changes, including population expansion or reduction, may influence all regions of the genome equally, the differences in Tajima's D within and between loci could be attributable to selection trends during the domestication process. Therefore, regions of the gene that show positive Tajima's D values could reflect balancing or overdominant selection, whereas regions with negative Tajima's D values could be associated with purifying selection. The signature of positive selection shown by the McDonald-Kreitman test at the Wx gene may be linked to traits of ecological adaptation to diverse agroclimatic conditions. The deviations detected in the various analyses are not significantly different from neutral expectations and confirm that the selection pressure associated with both traits is weak. Similar results have also been reported in previous studies in rice and maize [13, 14].
The total of 16 haplotypes detected at the Wx locus is lower than the previously reported 18 haplotypes among 37 glutinous and 68 nonglutinous rice accessions from Asia. However, the 16 haplotypes reported in our study are different from the haplotypes found in the previous study. There was no clear haplotype based partitioning of the rice varieties into glutinous and nonglutinous varieties. Haplotype analysis based on the Wx locus showed that haplotypes H1 to H5 formed a distinct cluster consisting of only indigenous varieties, which could serve as valuable material for future genetic improvement programs. Although the number of haplotypes varied when indels were considered in the network analysis, there was no clear grouping based on phenotypes.
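The sign of Tajima's D discussed above follows from comparing two estimates of diversity: the mean number of pairwise differences and the number of segregating sites scaled by sample size. The following is a minimal sketch of the standard Tajima (1989) formula, assuming gap-free aligned sequences; the function name is illustrative, and no significance test is included.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length sequences."""
    n = len(seqs)
    if n < 4:
        # The variance term can vanish for smaller samples.
        raise ValueError("need at least four sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to equal length")

    # S: number of segregating (polymorphic) alignment columns.
    seg_sites = sum(len(set(col)) > 1 for col in zip(*seqs))
    if seg_sites == 0:
        raise ValueError("D is undefined without segregating sites")

    # k: mean number of pairwise differences across all sequence pairs.
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)

    # Constants that depend only on the sample size n.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (k - seg_sites / a1) / sqrt(variance)


# Hypothetical toy samples (not data from this study).
balanced = ["AAC", "AAC", "TAG", "TAG"]   # variants at intermediate frequency
rare = ["AAC", "AAC", "AAC", "TAG"]       # variants carried by one sequence
print(tajimas_d(balanced) > 0, tajimas_d(rare) < 0)
```

Intermediate-frequency variants inflate the pairwise term and give positive D, while an excess of rare variants gives negative D, which is the pattern interpreted in the text.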
The OsC1 gene showed lower levels of polymorphism and reduced nucleotide diversity among the colorless apiculus varieties as compared to the colored apiculus varieties. Low nucleotide diversity is common in genes related to selected phenotypes [24, 42]. Sliding window analysis of nucleotide diversity showed that most regions of reduced diversity in the OsC1 gene were the same in the colored and colorless apiculus phenotypes (Figure 6). Such a concordant loss of diversity could be attributable to a population bottleneck during domestication.
Evidence for selection among the colorless apiculus varieties is detected through the high dN/dS ratio at the OsC1 locus (Table 4). As this gene is associated with the synthesis of anthocyanins, which have multiple functions including plant defense responses and signalling in plant-microbe interactions [25, 26], selection on this gene among the cultivated rice varieties cannot be ruled out. The negative Tajima's D values (Table 4) indicate an excess of rare alleles at the OsC1 locus among the colorless apiculus varieties, suggesting possible purifying selection. The colorless apiculus varieties showed more negative D values in the coding regions than their colored apiculus counterparts. These patterns are consistent with a recent selective sweep at the OsC1 gene among the colorless apiculus rice varieties. Translation of the coding regions of the OsC1 gene revealed that the 10 bp deletion within the third exon drastically reduces the protein size from 272 to 206 amino acids. This might have a significant impact on the expression of the OsC1 gene and the regulation of apiculus coloration in rice.
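One way a 10 bp deletion can shorten a protein is by shifting the reading frame (10 is not a multiple of 3) so that a stop codon is encountered early. The sketch below illustrates this on a hypothetical 36 bp coding sequence using the standard genetic code; the sequence and the deletion coordinates are invented for illustration and are not the OsC1 sequence.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a coding sequence up to (not including) the first stop."""
    protein = []
    for start in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE[cds[start:start + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical toy coding sequence (not OsC1).
wild_type = "ATGGCAGCA" "GCAGCAGCAG" "CTGTAACCGTGGCTTGA"
# Remove the 10 bases at positions 9-18 (0-based), shifting the frame.
deleted = wild_type[:9] + wild_type[19:]

print(translate(wild_type))  # full-length toy protein
print(translate(deleted))    # truncated by a premature stop codon
```

In this toy case the deletion exposes an in-frame TAA immediately downstream, so the translated product ends well before the original stop codon.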
The haplotype analysis revealed nine different haplotypes among the colored and colorless apiculus varieties. This is about 50% fewer than the 17 haplotypes previously reported among 39 wild and cultivated rice accessions. On the other hand, only two of the haplotypes reported in Saitoh et al. were detected in our samples, and the remaining seven haplotypes were unique to our study. These haplotypes formed two major groups of rice varieties. However, this grouping did not correspond to apiculus coloration. Similar results were obtained when gaps were included in the analysis. One group showed affinity with the agronomically improved varieties, and the other group, consisting of only indigenous varieties, formed a separate cluster.
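Counting haplotypes amounts to grouping varieties that share an identical aligned sequence. The following is a minimal sketch of that grouping step, assuming aligned sequences keyed by variety name; the variety names and sequences are placeholders, and the network construction itself is not shown.

```python
from collections import defaultdict


def group_haplotypes(named_seqs):
    """Map each distinct aligned sequence to the varieties that carry it."""
    groups = defaultdict(list)
    for name, seq in named_seqs.items():
        groups[seq.upper()].append(name)
    return dict(groups)


# Hypothetical placeholder data (not sequences from this study).
samples = {
    "variety_a": "ACGTAC",
    "variety_b": "ACGTAC",
    "variety_c": "ACGAAC",
}
haplotypes = group_haplotypes(samples)
print(len(haplotypes))  # number of distinct haplotypes in the toy sample
```

Whether gap characters are kept in the sequences or stripped beforehand determines if indels contribute to the haplotype count, which is why the count can differ between the two analyses.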