Identification and mapping of yield and yield related QTLs from an Indian accession of Oryza rufipogon

Background Cultivated rice (Oryza sativa L.) is endowed with rich genetic variability. In spite of this great diversity, modern rice cultivars have a narrow genetic base for most of the agronomically important traits. To meet the demand of an ever-increasing population, new avenues have to be explored to increase the yield of rice. Wild progenitor species are potential donor sources for complex traits such as yield and would help to realize the dream of sustained food security. Results The advanced backcross method was used to introgress and map new quantitative trait loci (QTLs) relating to yield and its components from an Indian accession of Oryza rufipogon. An interspecific BC2 testcross progeny (IR58025A/O. rufipogon//IR58025B///IR58025B////KMR3) was evaluated for 13 agronomic traits pertaining to yield and its components. Transgressive segregants were obtained for all the traits. Thirty-nine QTLs were identified using interval mapping and composite interval mapping. In spite of its inferiority for most of the traits studied, O. rufipogon alleles contributed positively to 74% of the QTLs. Thirty QTLs corresponded to QTLs reported earlier, indicating that these QTLs are stable across genetic backgrounds. Nine QTLs are novel and reported for the first time. Conclusion The study confirms that the progenitor species constitute a prominent source of still unfolded variability for traits of complex inheritance like yield. With the availability of the complete genome sequence of rice and the developments in the field of genomics, it is now possible to identify the genes underlying the QTLs. The identification of the genes constituting QTLs would help us to understand the molecular mechanisms behind the action of QTLs.


Background
Modern rice cultivars, in spite of their high yield potential and other desirable features, are handicapped by a narrow genetic base for most of the agronomically important traits, including the dwarf habit, which is the major yield-enhancing trait. A recent study of the ancestry of high-yielding Indian rice varieties revealed that hardly 5 to 6 accessions accounted for more than 90% of their genetic constitution, confirming that the cultivar gene pool now being depended on for improvement represents hardly 15% of the total genetic variability available in rice germplasm (E A Siddiq, personal communication). Rice is endowed with very rich genetic diversity. Wild/weedy species, along with a very large number of primitive cultivars and landraces, constitute an important reservoir of useful genes. The additional variability they can provide would be of great value to the ongoing crop improvement endeavor. Large genetic variability still remains untapped in the wild relatives and primitive cultivars of rice [1]. Considering the large hidden variability and the very rare and agronomically important genes they possibly possess, utilization of the wild species is critical to future crop improvement [2]. Utilization of these exotic species as donors in interspecific crosses is one of the strategies to harness their hidden potential and broaden the genetic diversity of the existing gene pool. Over the last decade, wild species of rice have been successfully utilized for introgression of diverse traits such as cytoplasmic male sterility (cms) [3][4][5][6], abiotic and biotic stress [7][8][9][10][11], yield and its components [12][13][14][15][16][17][18] and grain quality [19][20][21] into cultivars. A great deal of recent work on the wild species of rice has concentrated on the utilization of these species for quantitative traits such as yield and its components, along with grain quality.
In the first report on the use of wild species for introgression of quantitative characters, two yield QTLs, yld1.1 and yld2.1, each capable of increasing yield by about 18%, were identified in a Malaysian accession of O. rufipogon [12,13]. This was a precursor to many studies that identified numerous QTLs pertaining to yield and grain quality [12][13][14][15][16][17][18][19][20][21]. Keeping in view the considerable potential of wild/weedy species of rice as a source of yield genes, as evident from the foregoing research, the present study reports the identification and mapping of molecular marker-associated yield QTLs in an Indian accession of O. rufipogon (IC 22015). An interspecific testcross population, derived using an advanced backcross QTL strategy (AB-QTL) [22] between O. rufipogon and IR 58025A, a widely used cms line in India, was used to map QTLs related to yield and its components. The AB-QTL method has been successfully employed earlier in tomato and rice to transfer positive alleles from phenotypically inferior wild and weedy species into elite cultivars [23][24][25][14][15][16][17][18][19]. In addition to identifying potential novel QTLs for yield and its components, the results from the current study will provide additional data for comparison with QTLs previously documented in rice. Comparisons across different genetic backgrounds will provide information about the conservation of QTLs and help us to understand the interactions of QTL alleles across multiple backgrounds and environments.

Trait analysis and field performance
The phenotypic analysis of the 251 testcross families showed that the frequency distributions of all traits approximately fit a normal distribution (Figure 1). As expected in an interspecific cross, the character-wise frequency distributions of the testcross families showed transgressive segregants for all the traits. For a depiction of variation in tiller number and panicle length in the testcross families, see additional file 1. The average grain yield of the testcross families was 6.08 t/ha, with a range of 3.90 to 9.45 t/ha, while yield per plant ranged from 7.5 to 36.0 g with an average of 19.5 g. Thirteen testcross families outperformed the hybrid check, KRH2, by more than 20% for plot yield, and as many as 39 families showed more than a 20% increase in yield per plant compared to KRH2 (Table 1). Of the 251 testcross families studied, 75 showed at least a 20% increase over KRH2 for three or more yield components.

Trait correlations
The trait correlations conformed to the expected results. Interestingly, GW (grain weight) had no significant effect on PY (plot yield) but was negatively correlated with NP (number of panicles). For detailed character pair correlations among the traits, see additional file 2.

Marker polymorphism
Two hundred and ten microsatellite markers were used to screen the parents for polymorphic markers. Eighty markers (38%) detected polymorphism. This level of polymorphism is lower than in earlier studies involving O. rufipogon, where it ranged from 60 to 90% [13,15,17]. Polymorphism is a measure of genetic diversity and varies with the parental combination used. Earlier studies using a Malaysian accession of O. rufipogon (IRGC 105491) indicated varying frequencies of SSR polymorphism with indica (~60%) [13,17] and japonica (90%) [15] recurrent parents. The lower percentage of polymorphism may be due to a higher degree of genetic similarity between the O. rufipogon and O. sativa parents used in this study compared to those used earlier.

QTL analysis
A total of 39 QTLs were identified using composite interval mapping (CIM) and interval mapping (IM). CIM analysis detected fewer QTLs (25 QTLs) than IM (31 QTLs). While 17 QTLs (43.59%) were detected by both methods, 14 QTLs (35.90%) were identified exclusively by IM and 8 QTLs (20.51%) only by CIM (Table 3). Single marker analysis (SMA) identified a total of 45 QTLs for the 13 traits studied (see additional file 3). Forty-two of the 45 QTLs identified by SMA were also identified by CIM or IM, so these will not be discussed separately. Three QTLs, sf1.1, spp1.1 and hi1.1, were identified only by SMA. Variation in the number of QTLs detected by different methods has been previously reported for interspecific crosses involving O. rufipogon [15,17]. The 39 QTLs were distributed on chromosomes 1, 2, 3, 5, 8 and 9 (Figure 2).
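The counts reported above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis; the function and key names are hypothetical, and the shares are simply the three detection classes divided by the 39 QTLs, rounded to two decimals.

```python
def overlap_summary(both, im_only, cim_only):
    """Summarise QTL counts by detection method.

    both     : QTLs detected by both IM and CIM
    im_only  : QTLs detected by IM only
    cim_only : QTLs detected by CIM only
    """
    total = both + im_only + cim_only
    return {
        "total": total,                      # all QTLs from IM and/or CIM
        "im_total": both + im_only,          # everything IM detected
        "cim_total": both + cim_only,        # everything CIM detected
        "pct_both": round(100 * both / total, 2),
        "pct_im_only": round(100 * im_only / total, 2),
        "pct_cim_only": round(100 * cim_only / total, 2),
    }


# Counts taken from the text: 17 shared, 14 IM-only, 8 CIM-only.
summary = overlap_summary(both=17, im_only=14, cim_only=8)
for key, value in summary.items():
    print(f"{key}: {value}")
```

The per-method totals recovered this way (31 for IM, 25 for CIM) agree with the figures given in the text.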

Interaction among QTLs
A two-way test to detect epistatic interactions between marker loci was performed using the EPISTAT software [26]. The analysis identified a total of 15 interactions involving 20 markers spread across 8 different chromosomes (Table 4). These markers did not contribute to the phenotype singly but had a significant effect on the phenotype in combination with another marker, indicating strong G × G interactions. This may be one of the reasons for the transgressive segregants obtained.

Marker segregation
The allele frequency in a BC2 population without selection would be 87.5% IR 58025A alleles to 12.5% O. rufipogon alleles. Twenty-three markers (28.75%) were skewed towards one or the other parent, resulting in an allele frequency of 83.26% IR 58025A alleles to 16.74% O. rufipogon alleles. Ten marker loci (12.5%) were skewed towards the O. sativa parent, whereas 13 markers (16.25%) had an over-representation of O. rufipogon alleles. Skewness of markers towards one of the parents has been documented for interspecific as well as intersubspecific crosses in rice [14][15][16][17][18][19], [27][28][29]. A comparison of the results with earlier studies involving O. rufipogon revealed that the percentage of skewed markers was lower than that reported by Moncada et al [15] (37.6%) and Thompson et al [18] (42.5%) and higher than that reported by Septiningsih et al (21.4%) [17]. All three previous studies used the same accession of O. rufipogon (IRGC 105491) but different recurrent parents. This suggests that the polymorphism percentage is relative and depends on parental combination. Skewness towards the elite parent could have been due to the intensity of selection imposed in the BC1 generation, while skewness towards O. rufipogon may be due to reduced recombination and linkage drag in some regions of an interspecific population [30,31]. While segregation distortion of RM251 and RM7 on chromosome 3 may be due to their proximity to the gamete abortive gene, ga3 [32], the deviation from the Mendelian ratio of RM249 and RM44 might be due to the presence of these markers close to the centromeres of chromosome 5 and 8  on BC1 for high tiller number, a trait that was superior in O. rufipogon. RM13 was also shown to exhibit segregation distortion towards O. rufipogon in an earlier study [17].
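The expected 87.5% to 12.5% split follows from the halving of the donor contribution at each backcross: the F1 carries half donor alleles, and every backcross to the recurrent parent halves that share again. The sketch below illustrates this under the assumption of no selection and regular segregation; the function names are hypothetical, and the marker counts are those quoted in the text.

```python
def expected_donor_fraction(n_backcrosses: int) -> float:
    """Expected donor-parent allele frequency after n backcrosses.

    Assumes no selection: the F1 has 1/2 donor alleles and each
    backcross to the recurrent parent halves the donor share.
    """
    if n_backcrosses < 0:
        raise ValueError("n_backcrosses must be non-negative")
    return 0.5 ** (n_backcrosses + 1)


def percent(part: float, whole: float) -> float:
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


bc2_donor = expected_donor_fraction(2)      # O. rufipogon share in BC2
bc2_recurrent = 1.0 - bc2_donor             # IR 58025A share in BC2
print(f"BC2 expectation: {100 * bc2_recurrent:.1f}% recurrent, "
      f"{100 * bc2_donor:.1f}% donor")

# Shares of skewed markers among the 80 polymorphic markers (from the text).
print(f"skewed overall:   {percent(23, 80):.2f}%")
print(f"towards sativa:   {percent(10, 80):.2f}%")
print(f"towards wild:     {percent(13, 80):.2f}%")
```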

Trait correlations
The present study confirms that the major yield components show a significant positive relationship with yield. Most of the trait correlations conform to those reported earlier for studies involving O. rufipogon. Grain weight was negatively associated with both spikelet number per panicle and grain number per panicle [13,17,18]. In the present study, the correlation between grain weight and yield was non-significant, as also reported earlier for an IR64/O. rufipogon cross [17]. However, other studies on O. rufipogon report a positive correlation between yield and grain weight [13,18]. No correlation was detected in the present study between panicle number and yield, whereas a positive correlation between these two traits was reported in an IR64/O. rufipogon derived cross [17].

O. rufipogon derived QTLs for yield improvement
Oryza rufipogon alleles had a beneficial effect on 74% of the QTLs obtained for yield and yield components in the present study. This is a higher percentage than docu-   rufipogon. However, despite its inferiority for the trait, the alleles from the wild species had a beneficial effect on grain weight, indicating that the alleles contributing to grain weight might interact positively with the genetic background of IR 58025A.

Interaction among QTLs
An analysis to identify potential epistatic interactions between marker loci, using EPISTAT software [26], identified 20 markers involved in 15 two-way interactions (Table 4). None of these markers affected the trait singly, but each showed an enhanced effect when combined with another marker. The resulting G × G interactions between these markers may be one of the reasons for the appearance of transgressive segregants in the population. Several chromosomal regions were associated with more than one trait, indicating linkage or pleiotropic effects. For example, the QTLs gnp2.2 and yldp2.2, associated with an increase in grain number per plant and yield per plant respectively, were located in the same region on chromosome 2. Similarly, the region associated with nt2.1, which controlled an increase in number of tillers, was linked to np2.1, gnp2.1, yldp2.1, hi2.1 and yld2.1, controlling an increase in number of panicles, grain number per plant, yield per plant, harvest index and plot yield respectively. The O. rufipogon alleles had a beneficial effect on all these traits. However, the same region is associated with a negative QTL from O. rufipogon, gw2.3, resulting in decreased seed weight. At a different chromosomal region, the O. rufipogon allele at the QTL gw2.1, which increases grain weight, is linked to two negative QTLs, sn2.1 and gn2.1, which decrease spikelet number per panicle and grain number per panicle. The reverse is true for the region associated with another QTL for grain weight, gw2.2. This negative QTL from O. rufipogon is linked with two QTLs, gnp2.1 for grain number per plant and yldp2.1 for yield per plant, where the O. rufipogon alleles had a positive effect. It is notable that the chromosomal region associated with a positive QTL for grain weight coincides with negative QTLs for spikelet number and grain number, and vice versa.
As grain weight is negatively correlated with spikelet number and grain number, it is tempting to speculate that the same QTL might contribute to both phenotypes. Further characterization of this region by fine mapping and identification of the genes underlying it will shed more light on whether the same set of genes, regulated differentially, or an entirely different set of genes governs these phenotypes. The association of positive and negative QTLs with the same chromosomal regions was reported earlier for studies involving O. rufipogon, where the positive QTLs for grain weight and panicle length together, and panicle length alone, were linked with negative QTLs for plant height and broken rice respectively [17,19]. In view of the association of positive and negative QTLs with the same chromosomal regions, careful selection will be needed to avoid negative characteristics in the crop improvement process.   [13,17,18]. However, the same region is associated with a negative QTL for panicle length (pnl) in a study involving another accession (P 16) of O. rufipogon [14]. This indi-  [13] where the O. rufipogon alleles had a beneficial effect.

Conclusion
The study, while confirming the view that the progenitor species constitute the largest source of still unfolded variability for traits of complex inheritance like yield and its components, has helped identify additional novel variability for yield improvement. The novel QTLs identified are good candidates for fine mapping and positional cloning studies, while the QTLs that map to regions consistent with other studies can be useful for marker-assisted transfer. The availability of the complete rice genome sequence and the rapid advances being made in genomics will help dissect and characterize yield-related QTLs further. Considering the potential of new yield-influencing QTLs, more research is warranted to unearth and use novel yield-related gene blocks hidden in closely related wild/weedy species and primitive cultivars, if the rice-dependent world is to truly attain and sustain food security.

Phenotypic evaluation of mapping population
The 251 testcross families, two parents and checks viz.,

Trait correlations
Correlations between character pairs were computed at p < 0.05 and p < 0.01 in Excel using trait averages.
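As an illustration of how a character-pair correlation can be flagged at the two significance levels, the sketch below computes Pearson's r and an approximate two-sided p-value. It is not the Excel procedure used in the study: the trait values are hypothetical, the function names are invented for this example, and the p-value uses a normal approximation to the t distribution, which is only reasonable for large samples such as the 251 families analysed here.

```python
import math
from statistics import NormalDist, mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_flag(x, y):
    """Return (r, approximate two-sided p, flag).

    t = r * sqrt((n - 2) / (1 - r^2)); the p-value is taken from the
    standard normal as a large-sample approximation (an assumption).
    Flags: '**' for p < 0.01, '*' for p < 0.05, 'ns' otherwise.
    """
    n = len(x)
    r = pearson_r(x, y)
    denom = 1.0 - r * r
    if denom <= 0.0:          # perfect correlation: p is effectively zero
        p = 0.0
    else:
        t = r * math.sqrt((n - 2) / denom)
        p = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    flag = "**" if p < 0.01 else "*" if p < 0.05 else "ns"
    return r, p, flag


# Hypothetical trait averages for 40 families (not data from the study).
trait_a = list(range(1, 41))
trait_b = [(-1) ** i for i in range(40)]            # unrelated to trait_a
trait_c = [2 * v + (-1) ** v for v in trait_a]      # closely tracks trait_a

print(correlation_flag(trait_a, trait_b))
print(correlation_flag(trait_a, trait_c))
```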

DNA extraction
DNA was extracted from two-month-old leaf tissue using the protocol of Dellaporta [42].

Parental polymorphism and linkage map construction
A set of 210 randomly selected microsatellite markers (donated by the Rockefeller Foundation to EAS) spanning all 12 chromosomes was screened on the O. sativa and O. rufipogon parents. A total of 80 polymorphic microsatellite markers, separated by an average distance of 15.37 cM, were used to analyze the 251 testcross progeny. Linkage maps were constructed using Mapmaker version 3.3 [43] with the Kosambi mapping function [44]. Linkage groups were determined using the 'group' command with a LOD score of 3.0 and a recombination fraction of 0.5. The order of the markers in each group was determined using the 'order' and 'ripple' commands. Linkage groups were assigned to the respective chromosomes based on the rice maps developed at Cornell University [18,45].
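For reference, the Kosambi mapping function converts a recombination fraction r into a map distance d = (1/4) ln[(1 + 2r)/(1 − 2r)] Morgans. The sketch below implements this standard formula and its inverse in centimorgans; the function names are hypothetical and the code is an illustration of the formula, not a reproduction of Mapmaker's internals.

```python
import math


def kosambi_cm(r: float) -> float:
    """Map distance in cM for recombination fraction r (0 <= r < 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    # 0.25 * ln(...) gives Morgans; multiplying by 100 gives centimorgans.
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_r(d_cm: float) -> float:
    """Recombination fraction for a Kosambi map distance given in cM."""
    return 0.5 * math.tanh(2.0 * d_cm / 100.0)


for r in (0.05, 0.10, 0.20, 0.30):
    print(f"r = {r:.2f} -> {kosambi_cm(r):.2f} cM")
```

At small r the map distance is close to 100 × r, and it grows faster than r as interference between crossovers becomes less complete.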

QTL analysis
QTLs were analyzed using single marker analysis (SMA), interval mapping (IM) and composite interval mapping (CIM). Single marker analysis was performed by regression of field performance on marker genotypes using a standard analysis of variance (ANOVA) procedure at a statistical threshold of p < 0.01, assuming regular segregation of wild and cultivated alleles in the testcross families. The proportion of observed phenotypic variance attributable to a particular QTL was estimated as the difference between the mean of the segregants having the O. rufipogon allele and the mean of the segregants that did not have the O. rufipogon allele. The phenotypic variance over the check KRH2 was calculated in a similar manner. QTL analysis by interval mapping (IM) and composite interval mapping (CIM) [46] was done using QTL Cartographer 3.0 [47]. The significance threshold for identification of a QTL (for both IM and CIM) was determined based on permutation tests at a significance level of p < 0.01 [48]. Based on 1000 permutations for each trait, the threshold for IM and CIM corresponded to a minimum LOD score of 2.5. The proportion of phenotypic variance (R²) and the additive effect were determined for each trait. Deviations from the expected Mendelian ratio were calculated using MapDisto software [49], and the digenic interactions between marker loci were determined using EPISTAT software [26]. The QTL nomenclature followed was as reported in [50].
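The idea behind a permutation-based LOD threshold can be sketched as follows: phenotypes are shuffled relative to genotypes, the genome-wide maximum LOD is recorded for each shuffle, and the (1 − α) quantile of those maxima is taken as the threshold. This is a simplified illustration, not QTL Cartographer's implementation. It assumes a single-marker regression LOD in place of IM or CIM scans, two genotype classes coded 0/1, and simulated data; all names and parameter values are hypothetical.

```python
import math
import random


def lod_single_marker(genotypes, phenotypes):
    """LOD for one marker: (n/2) * log10(RSS_null / RSS_marker)."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)
    rss_marker = 0.0
    for g in set(genotypes):
        ys = [y for gi, y in zip(genotypes, phenotypes) if gi == g]
        class_mean = sum(ys) / len(ys)
        rss_marker += sum((y - class_mean) ** 2 for y in ys)
    if rss_null == 0.0:        # no phenotypic variation at all
        return 0.0
    if rss_marker == 0.0:      # marker explains the phenotype perfectly
        return math.inf
    return (n / 2.0) * math.log10(rss_null / rss_marker)


def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.01, seed=0):
    """(1 - alpha) quantile of the maximum LOD over permuted phenotypes."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype association
        maxima.append(max(lod_single_marker(m, shuffled) for m in markers))
    maxima.sort()
    index = min(n_perm - 1, math.ceil((1.0 - alpha) * n_perm) - 1)
    return maxima[index]


# Simulated data (hypothetical): 60 families, 10 markers, no real QTL.
sim = random.Random(1)
n_families, n_markers = 60, 10
markers = [[sim.randint(0, 1) for _ in range(n_families)]
           for _ in range(n_markers)]
phenotypes = [sim.gauss(20.0, 5.0) for _ in range(n_families)]

threshold_01 = permutation_threshold(markers, phenotypes, n_perm=200, alpha=0.01)
threshold_05 = permutation_threshold(markers, phenotypes, n_perm=200, alpha=0.05)
print(f"LOD threshold at alpha = 0.01: {threshold_01:.2f}")
print(f"LOD threshold at alpha = 0.05: {threshold_05:.2f}")
```

Taking the maximum across all markers in each permutation is what makes the threshold genome-wide rather than per-marker; a stricter α selects a higher quantile and therefore a threshold at least as large.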