Genome-wide response to selection and genetic basis of cold tolerance in rice (Oryza sativa L.)

Background Cold stress is an important factor limiting rice yield in many areas of high latitude and altitude. Considerable efforts have been taken to genetically dissect cold tolerance (CT) in rice using DNA markers. Because of possible epistasis and gene × environment interactions associated with identified quantitative trait loci, the results of these genetic studies have unfortunately not been directly applicable to marker-assisted selection for improved rice CT. In this study, we demonstrated the utility of a selective introgression strategy for simultaneous improvement and genetic dissection of rice seedling CT. Results A set of japonica introgression lines (ILs) with significantly improved seedling CT were developed from four backcross populations based on two rounds of selection. Genetic characterization of these cold-tolerant ILs revealed two important aspects of genome-wide responses to strong phenotypic selection for rice CT: (1) significant over-introgression of donor alleles at 57 loci in 29 functional genetic units (FGUs) across the rice genome and (2) pronounced non-random associations between or among alleles at many unlinked CT loci. Linkage disequilibrium analyses of the detected CT loci allowed us to construct putative genetic networks (multi-locus structures) underlying the seedling CT of rice. Each network consisted of a single FGU, with high introgression as the putative regulator plus two to three groups of highly associated downstream FGUs. A bioinformatics search of rice genomic regions harboring these putative regulators identified a small set of candidate regulatory genes that are known to be involved in plant stress response. Conclusions Our results suggest that CT in rice is controlled by multiple pathways. Genetic complementarity between parental-derived functional alleles at many loci within a given pathway provides an appropriate explanation for the commonly observed hidden diversity and transgressive segregation of CT and other complex traits in rice.


Background
Rice (Oryza sativa L.) is the most important cereal crop and a staple food for more than half the world's population. Although originating in swampy areas of the tropics, rice is now grown globally in diverse ecologies and thus suffers a wide range of abiotic stresses. Low temperature or cold is a worldwide problem limiting rice yield, particularly in northeast Asia and high mountain areas. Rice crops normally suffer two types of cold stress. During the early rice-crop season, low-temperature stress restrains seedling establishment and plant growth/development; at the reproductive stage, cold induces low fertility and poor grain filling of single-crop rice [1]. In China, cold is responsible for average national yield losses of~5% and~2% in the early rice crop of the south and the single-crop rice of most rice growing areas, respectively [2].
Rice genotypes differ considerably in their cold tolerance (CT) [3]. The development of high-yielding, coldtolerant cultivars is the most efficient way to overcome the problem of low-temperature stress. Because of their generally superior CT compared with that of indica rice, subsp. japonica accessions are normally used as donors to confer CT when breeding for improved CT [3]. Nevertheless, most japonica varieties tend to suffer more frequently from cold because they are distributed in high latitude or high elevation regions. Because of the generally low level of genetic diversity within the japonica gene pool, further CT improvement of elite japonica cultivars remains a huge breeding challenge [4,5]. Progress with respect to the improvement of rice seedling CT has recently been made using phenotypic selection and a conventional breeding strategy [6]. Considerable efforts have meanwhile been taken to genetically dissect rice CT using DNA markers, resulting in the discovery and mapping of many quantitative trait loci (QTLs) associated with rice CT [7][8][9][10][11][12][13][14][15][16][17]. Two major QTLs for CT, qCTS4 and qCTS12, have been finemapped onto rice chromosome 4 and the short arm of chromosome 12, respectively [1,18]. Unfortunately, results from these genetic studies have not been directly applicable to marker-assisted selection for improved rice CT owing to possible epistasis and gene × environment interactions associated with the identified QTLs [19].
We have previously reported several successful applications of a large-scale backcross (BC) breeding strategy to improve abiotic stress tolerance in rice [20][21][22][23][24] as well as a forward genetics strategy for QTL discovery and allelic mining of complex traits using DNA markers and introgression lines (ILs) developed from BC breeding programs [25,26]. In this study, we used a set of japonica ILs to demonstrate the utility of this strategy for simultaneous improvement and genetic dissection of rice seedling CT.

Development of cold-tolerant ILs
For the recurrent parent (RP), we used the superior japonica restorer line C418, the male parent of many elite japonica hybrid cultivars commercially grown in northern China. Four cold-sensitive indica lines, Zihui100 (ZH100) from central China, Bg300 from Sri Lanka, Cisanggarung (Cis) from Indonesia, and Manawthukha (MNTH) from Myanmar, were used as donors because the indica gene pool is known to contain huge amounts of hidden genetic diversity for abiotic stress tolerance [20,22,23]. The four bulk BC 2 F 2 populations were screened for seedling CT in the spring of 2003 in LAAS. Approximately 1,000 seeds of each BC 2 F 2 population were sown in a seedling nursery covered by plastic film on April 3, 15 days earlier than normal planting time, yielding~800 seedlings from each population. The plastic film was removed at the 2.5-leaf stage, exposing the seedlings for 20 d to outside low temperatures ranging from 6.7 to 23.0°C that severely inhibited the growth of C418. Only 41 seedlings from four BC populations showed normal growth and obviously better CT than C418. These plants were selected and transplanted individually into the field in mid-May. At maturity, seeds from a single panicle were harvested from each of the selected BC progenies as single BC 2 F 3 lines. Seeds from the 41 BC 2 F 3 lines were planted as individual lines under normal conditions in the LAAS Hainan winter nursery during the winter of 2003-2004. One or more plants in each of the BC 2 F 3 lines that had similar or better yield performances than C418 were visually selected and advanced as BC 2 F 4 ILs. During the following summer and winter of 2004, similar visual selection of BC 2 F 4 ILs was repeated, resulting in a final set of 177 BC 2 F 6 ILs derived from the 41 originally selected BC 2 F 2 plants.

Progeny testing of selected ILs under natural and controlled low-temperature conditions
Two experiments were performed to confirm the seedling CT of the selected BC 2 F 6 ILs. In the first experiment, the 177 BC 2 F 6 ILs and C418 were evaluated under both normal and stress conditions in the LAAS seedling nursery at Shenyang in the early spring of 2005. According to published methods [10,11], soaked seeds of each IL were sown onto a single row plot in the seedling nursery on April 6, 2005. At the 2.5-leaf stage, approximately 15 vigorous seedlings were retained in each plot. The plots were arranged sequentially with three replications for each line. In the low-temperature stress treatment, the plastic film was unveiled to expose the seedlings to the natural low-temperature stress of early spring for 14 d. During this period, the average daily temperature was 12.0°C, and average low and high temperatures were 6.6 ± 2.0°C (ranging from 4.3 to 9.0°C) and 23.7 ± 2.4°C (ranging from 20.9 to 26.8°C), respectively. In the normal control, all plots were completely covered by the plastic film and maintained at an average daily temperature above 20°C. At the end of the treatments, five representative seedlings in each plot under stress or normal conditions were sampled for measurement of their seedling heights (SHs). The sampled seedlings were then dried in an oven at 78°C for 48 h before being measured for seedling dry weight (SDW). Cold response index (CRI) percentages were determined as follows: CRI (%) = (SDW under stress/SDW under normal conditions) × 100.
In the second experiment, the 177 BC 2 F 6 ILs and their parents were evaluated for seedling CT in a growth chamber at LAAS in 2007. Using a previously reported method [7], well-germinated seeds of each line were sown into a two-row plot in a 65 cm × 44 cm × 14 cm plastic box, six lines per box. The boxes were kept in the growth chamber at a constant day/night temperature of 25°C, a 12-h photoperiod, and a relative humidity at 75-80%. We set up three replications of each IL and six replications of C418. Approximately 30 normal seedlings in each plot were allowed to grow until the three-leaf stage. The growth chamber was then slowly adjusted to a constant day/night temperature of 4°C for 7 d and subsequently slowly adjusted back to a constant day/night temperature of 25°C for 4 d. The number of surviving seedlings in each plot was recorded; seedling survival percentage (SP) was calculated from the average of three replications and used as input data for genetic analysis of seedling CT.

Genotyping experiments
To reconstruct genotypes of the originally selected BC 2 F 2 plants, DNA was isolated from bulked fresh leaf tissues of each of the 177 BC 2 F 2:6 ILs using the CTAB method. We tentatively divided the rice genome into 185 well-distributed bins each representing~2 Mb physical distance based on the Rice Annotation Project Database IRGSP-1.0 (http://rapdb.dna.affrc.go.jp/). More than 600 rice anchor simple sequence repeat (SSR) markers were used to survey the parental lines, resulting in 298 SSR markers polymorphic between the RP and donors in 137 different bins across the rice genome. The generated markers were used to genotype the 177 selected BC 2 F 6 ILs (Additional file 1: Table S1). On average, ILs from each BC population were genotyped with 110 polymorphic SSR markers, covering most introgressed donor segments in the ILs.

Detection of functional genetic units (FGUs) and putative genetic networks (multilocus structures) underlying CT in the selected ILs
Drawing on molecular quantitative genetics theory [26], we define two concepts, functional genetic units (FGUs) and the principle of hierarchy. These concepts are based on the two most common types of functional relationships seen among genes functioning in a signaling pathway affecting complex traits. Hierarchy reflects the one-way functional dependency (FD) of downstream pathway genes on their upstream regulators. A FGU represents the mutual FD among a group of genes functioning at each signaling pathway level; it affects phenotype(s) in the manner of a "house of cards", i.e., with complete complementarity. Nonrandom associations (multilocus structures) are predicted to result from these two types of FD between or among unlinked loci within a signaling pathway. Furthermore, upstream FGUs (regulators) in a signaling pathway affecting complex traits are expected to have larger phenotypic effects and thus respond more strongly to selection than downstream FGUs. In our selection experiment, alleles at segregating regulatory loci affecting the target trait were thus expected, in response to selection, to show greater shifts in introgression frequencies (IFs) of the functional genotypes (defined as either one of the two homozygotes or the heterozygote containing the functional allele at the identified loci).
Using the progeny testing data observed from the 177 ILs under controlled low-temperature conditions, contrast tests were performed using SAS PROC GLM [27] to compare differences in SP between ILs and C418. As a result, 30 BC 2 F 2:6 ILs were identified that had significantly higher SPs than C418. Based on the above-described concepts, a FGU could comprise either single loci showing significant excess introgression or an association group (AG) of r (where r ≥ 2) unlinked but perfectly associated loci exhibiting equal introgression in the cold-tolerant ILs selected from each BC population. To detect FGUs and multilocus structures (putative genetic networks) associated with CT in rice, we conducted statistical tests using the genotypic data from the 30 confirmed cold-tolerant BC 2 progeny. First, χ 2 tests were performed to detect whether allelic and genotypic frequencies at individual loci across the genome in the CT ILs from each BC population deviated significantly from Mendelian expectations. Those loci showing significant over-introgression were considered to be putative CT loci [25]. To minimize the probability of false positives, we used a conservative threshold of P < 0.001 for claiming a putative CT locus. Second, linkage disequilibrium (LD) analyses [28] were performed to detect non-random associations or AGs between or among unlinked loci showing excess introgression. For each AG consisting of r (where r ≥ 2) unlinked loci exhibiting equal levels of excess donor alleles co-introgression in confirmed cold-tolerant ILs from each population, r · (r-1)/2 significant pairwise associations would be expected to exist between the r loci. An AG was considered to be significant on the basis of the following thresholds P ≤ 0.005 and D' = 1.0 for each of the r · (r-1)/2 pairwise associations. To reveal the multilocus structure or putative genetic network underlying CT, pairwise gametic LD analyses were performed to characterize the relationships between alleles at all FGUs detected in the cold-tolerant ILs from each BC population. Using the genotypic data of the ILs from each BC population, we calculated the LD statistic [28]D AB ¼p AB −p ApB , wherep AB ,p A , andp B were the frequencies of cointrogressed functional genotype AB and functional genotypes at FGU A and FGU B, respectively. Here, a functional genotype was either the donor homozygote or the heterozygote. For significance testing ofD AB , we used the test statis- where n was the sample size and the normalizedD AB was estimated asD Þ . A multilocus genetic network containing all detected FGUs in the confirmed cold-tolerant ILs was then constructed in two steps. First, all FGUs detected in ILs from a single population were divided into major groups based on the LD results. The criterion used for delineating these groups was that individual FGUs of different IFs within each group were all significantly and positively associated, withDs AB 0 = 1.0, and were either independent or negatively associated with FGUs in different groups. Second, all FGUs within a given group were connected, forming multiple layers based on the principle of hierarchy according to their progressively reduced functional genotype frequencies and inclusive relationships [26]. In addition, to confirm the detected loci, we used the twosample test for proportions in SAS to compare the IF at each identified CT locus between two segregating groups of BC 2 F 6 ILs with contrasting CT phenotypes, either high SP or low SP, all derived from the same selected BC 2 F 2 plants [27]. Detection of a significant difference between these two segregating BC 2 F 6 groups would verify the CT FGUs determined using the previous method. Finally, a bioinformatics search based on the RGAP 7 database and information in Rice Genome Annotation Project data libraries (http://rice.plantbiology.msu.edu) was performed to identify possible positional candidate genes for CT in those genomic regions (mapped markers) where important CT FGUs were detected.

Development of ILs with significantly improved CT
From the~3,200 BC 2 F 2 plants in the four populations subjected to 20 d of natural low-temperature stress, 41 BC 2 F 2 plants were selected. The number of selected plants ranged from 7 in the C418/MNTH population to 15 in the C418/CG population, with an average selection intensity of 1.3% (Table 1). Progeny testing of the 177 derived BC 2 F 2:6 ILs to confirm their CT under more severe cold stress in a growth chamber revealed that 68 BC 2 F 6 ILs derived from 30 of the 41 selected BC 2 F 2 plants had significantly higher SPs and CRI values than C418 (Table 2; Additional file 2: Figure S1). Average seedling SP of the BC 2 F 2:6 ILs was 3.3 times higher than that of C418, ranging from 1.6 times higher for the 62 ILs in the C418/Bg300 population to 4.7 times higher for the 53 ILs in the C418/CG population. Compared with C418, the ILs had significantly increased SH and SDW CRIs under natural lowtemperature stress, respectively 6% and 18% higher (Additional file 2: Figure S1). Thus, 30 of the 41 selected ILs were confirmed to have significantly improved CT over that of C418, including 6 lines each from the C418/ ZH100 and C418/Bg300 populations, 13 lines from the C418/CG population, and 5 lines from the C418/MNTH population ( Table 2).

Genome-wide responses to selection for CT and detection of FGUs for CT
Strong phenotypic selection for CT resulted in significant over-introgression of the donor segments in the 30 verified cold-tolerant BC 2 F 2 ILs. On average, donor introgression was 0.201 in the cold-tolerant BC 2 F 2 ILs (Table 1), 160% more than the value of 0.125 expected in a BC 2 population. This additional introgression was primarily the result of genome-wide excess heterozygosity. The BC 2 ILs had an average donor homozygote frequency of 0.056, slightly lower than expected, and an average heterozygote frequency of 0.289, which was 2.3 times higher than the expected value (Table 1).
χ 2 tests at individual loci and LD analyses using the SSR genotypic data identified 57 loci in 29 FGUs (17 single loci and 12 AGs) across the rice genome at which highly significant over-introgression of the donor alleles was detected in the selected cold-tolerant ILs. The number of identified loci ranged from 9 loci in 5 FGUs in the C418/ZH100 population to 20 loci in 15 FGUs in the C418/Cis population (Table 3; Additional file 1: Table  S2). The average IF of the donor alleles at the 57 CT loci was 0.469, or 3.75 times as much as the expected value of 0.125 (Table 3). These FGUs were distributed in 48  bins (~25%) across the rice genome ( Figure 1; Table 3).

Putative genetic networks (multi-locus structures) underlying CT
Of the 108 possible pairwise gametic-phase LD statistic (D') values calculated between detected FGUs in coldtolerant ILs from each BC population (Additional file 1: Table S3), 31 (28.7%) were statistically significant (Additional file 1: Table S4). The presence of significant non-random associations between or among alleles at unlinked CT loci in ILs possessing extreme (CT) phenotypes implied strong epistasis between or among these alleles. The results shown in Table S4 therefore allowed us to construct putative genetic networks underlying rice CT based on the principle of hierarchy [26] (Figure 2A-D). Figure 2A is a representation of a putative genetic network containing five FGUs detected in the six coldtolerant ILs of the C418/ZH100 population. The donor allele at RM4584 (bin 7.1) was present with an IF of 0.917 in all six cold-tolerant ILs, and was thus placed at the top of the network as the putative regulator. Two highly associated FGUs, AG A1 (bins 1.11, 1.15, 4.12, and 9.6) and RM6835 (bin 7.9) had lower IFs and formed the downstream branch A-I, while two highly associated FGUs, RM5847 (bin 7.12) and AG A2 (bins 6.3 and 6.15) formed the other downstream branch, A-II. FGUs in A-I and those in A-II were independent of one another (Additional file 1: Table S3). The six cold-tolerant ILs from this population could be classified into three major types based on their graphical genotypes at the five FGUs. IL1 and IL5 possessed donor alleles at both A-I and A-II. IL2 and IL3 had donor alleles at branch A-II, and IL4 featured donor alleles at branch A-I (Figure 2A). Interestingly, these ILs did not differ significantly from one another with respect to CT traits ( Table 2), suggesting the functional redundancy of A-I and A-II FGUs with regard to their associations with CT traits. Figure 2B illustrates the putative genetic network containing all six FGUs (four AGs and two loci) detected in the six ILs of the C418/Bg300 population. All six ILs were characterized by donor alleles at AG B1 (bins 2.6, 5.10, and 12.5) with IF = 0.833, which was thus placed at the top of the network as the putative regulator. Five unlinked but highly associated loci in two FGUs, AG B2 (bins 2.4, 2.11, 7.10, and 8.2) and RM247 (bin 12.2), had lower IFs, and formed downstream branch B-I. Similarly, branch B-II consisted of four unlinked but highly associated loci in two FGUs, AG B3 (bins 3.5, 6.3, and 9.9) and RM447 (bin 8.14), which had lower IFs and were thus placed downstream of AG B1 . FGU AG B4 comprised seven unlinked but highly associated loci in bins 1.5, 3.12, 5.15, 6.12, 10.3, 10.9, and 11.14 ( Table 3) that could be placed downstream of either branch B-I or B-II based on its significant associations with AG B2 or AG B3 . The six ILs of this population were classifiable into four types. IL1, IL2, and IL4 had donor alleles at all loci of the six FGUs. IL5 had donor alleles at all loci except for the two single downstream loci (RM247 and RM447). IL3 featured donor alleles at branch B-I FGUs, while IL6 carried donor alleles at branch B-II FGUs. Again, phenotypic values of the CT-related traits did not seem to correlate with the number or type of downstream FGUs in the cold-tolerant ILs ( Table 2). Figure 2C depicts the putative genetic network comprising 15 FGUs (20 loci) detected in the 13 coldtolerant ILs from the C418/CG population. RM5711 (bin 7.2) was placed at the top of the network as the putative regulator because all 13 cold-tolerant ILs had donor alleles at this locus. This network consisted of three major branches. AG C1 (bins 6.15 and 10.9) had very high IFs in the 13 ILs and thus was placed in the upper layer of branch C-I, which was connected with two associated loci located downstream at RM6973 (bin 12.1) and RM3850 (bin 2.18) with slightly lower IFs. Branch C-II had five FGUs, with RM5754 (bin 6.3) having the highest IF in the upper layer and significantly associated with four loci at bins 4.17, 11.10, 3.18, and 4.12 of lower IFs in three downstream sub-branches. Branch  (bins 3.8, 3.16, and 9.1) was significantly associated with RM5754 and RM3126, and thus could be placed downstream of either branch C-II or C-III; a similar situation existed with respect to AG C4 (bins 2.12 and 11.4). The 13 ILs from this population had different allelic combinations at loci in downstream FGUs in the network ( Figure 2C), but most ILs, except for IL1, IL2, and IL3, were similar with regard to measured CT-related traits. Figure 2D corresponds to the genetic network comprising three highly associated FGUs detected in the five cold-tolerant ILs from the C418/MNTH population. All five ILs of this population had donor alleles at two perfectly associated loci at bins 4.3 and 9.9 in AG D1 ; this AG was thus placed at the top of the network as the putative regulator. AG D1 was associated with seven unlinked loci in two other FGUs of lower IFs, including six highly associated downstream loci in AG D2 (bins 6.12, 7.5, 9.3, 10.10, 11.2, and 11.9) and RM5970 (bin 5.12). Again, the five ILs of this population were very similar with respect to the measured CT traits ( Table 2), but they differed greatly in their allelic combinations at the two downstream FGUs.

Discussion
With respect to the genetic basis of complex traits, selection and transgressive segregation have long been central issues in plant breeding and evolution. In this study, we developed 30 ILs that had significantly improved CT compared with their recurrent parent, C418.
Our results indicate that all four indica donors contributed CT-enhancing alleles to the japonica recipient, C418. These results additionally imply the existence of a rich, hidden genetic diversity in the indica gene pool for improving japonica rice CT, as none of the donors were able to survive the same low temperature stress as the tested progeny in a preliminary experiment (unpublished observations). This type of transgressive segregation was observed for almost all complex traits in~90% of the BC breeding populations in our large introgression breeding programs [20][21][22][23][24]. This phenomenon was our primary reason for not selecting parental lines based on their phenotype in any single target traits, as we were aiming to improve multiple complex traits using donors of diverse origins. Thus, our strategy of using introgression breeding, strong phenotypic selection plus genetic tracking and characterization of donor introgression using DNA markers has three main advantages that have been described previously [25]. These advantages are: (1) simultaneous improvement and genetic dissection of target traits under selection, Table 3 Genomic information for 29 functional genetic units (FGUs) (17 single loci and 12 association groups or AGs) for seedling cold tolerance (CT) detected by χ 2 tests (single loci) and multi-locus linkage disequilibrium analyses in 30 cold-tolerant introgression lines (ILs) selected from four BC 2 F 2 populations derived from crosses between the recurrent parent (C418) and four donors (ZH100, Bg300, Cis, and MNTH) (Continued) 3 3 FG is the frequency of functional genotypes (donor homozygote and the heterozygote). 4 Two-sample Z tests to compare the proportions of two groups of BC 2 F 6 ILs derived from the same BC 2 F 2 individual but with significantly different CT phenotypes (SP) was performed using SAS [27]. The first revealed aspect was the expected genome-wide significant allelic frequency shift, or over-introgression, of donor alleles at many loci across the genome in the selected cold-tolerant ILs. In this study, we detected 57 CT loci having an average IF of 0.469, or 3.75 times higher than expected (Table 3). Furthermore, 13 (~32%) of the 57 identified CT loci were simultaneously identified across multiple C418 BC populations having different indica donors ( Figure 1). Fifteen (~37%) additional loci were verified in the segregating BC 2 F 2:6 ILs from the originally selected (See figure on previous page.) Figure 1 Genomic distribution into 48 bins of 57 loci in 29 functional genetic units (FGUs) underlying seedling cold tolerance (CT) in 30 introgression lines (ILs) from four BC 2 F 2 populations of rice. Boxes on the right side of each chromosome are FGUs detected in cold-tolerant ILs; symbols on the left are main-effect QTLs and epistatic QTLs associated with CT previously reported in other rice populations (Table 3; Additional file 1:  Table S4). CT-related genes harbored in or near FGU regions are shown on the left side of each chromosome. Boxes with thicker outlines indicate FGUs verified by comparison of introgression frequencies between two groups of BC 2 F 2:6 ILs with contrasting CT phenotypes derived from the same BC 2 F 2 plants. Colored bins indicate FGUs of high introgression that are very likely to be upstream regulatory genes in the genetic networks (Figure 2A-D) according to molecular quantitative genetics theory [26]. Regions in rectangular boxes represent linked bins detected in multiple populations. In graphical genotypes of each network, unfilled and fully colored cells represent the recipient homozygote and donor functional genotypes (donor homozygote plus the heterozygote), respectively. Numbers in cells of graphic genotypes are the number of loci included in the FGU (1 represents a single locus, while ≥ 2 represents an association group, AG). Here, an AG represents a group of unlinked but perfectly associated loci as shown in Table 3. (Table 3), and 29 (71%) loci were mapped to genomic regions previously reported to harbor major CT QTLs in different rice mapping populations (Figure 1; Table 3). Thus, few, if any, of the identified CT loci were false positives. The high efficiency in detecting CT loci using this selective introgression method is unsurprising, as the 30 cold-tolerant ILs were originally selected from 3,200 individuals in the four BC populations. This result meets the theoretical expectation that the stronger the selection intensity, the greater the amount of target-trait genetic information possessed by the selected progeny [29].

BC 2 F 2 plants
The second and most important aspect of genomewide responses concerns the pronounced non-random associations observed between or among alleles at the detected CT loci, as reflected by the identification of 12 AGs and large numbers of significant LDs between different FGUs in the cold-tolerant ILs of each population. Historically, pronounced non-random associations between or among unlinked isozyme loci have been reported in self-pollinated plant species such as barley and rice, and have been interpreted as the consequence of selection operating on co-adapted gene complexes [30,31]. Unfortunately, genome-wide characterization of non-random associations between or among unlinked loci due to strong directional selection has rarely been carried out in experimental populations. Statistically, strong epistasis is implicated by strong non-random associations between or among alleles at unlinked loci in individuals of extreme phenotypes [26,30]. Indeed, six epistatic QTL pairs conferring CT (bin 3.5-bin 6.3, bin 6.3-bin 9.10, bin 4.12-bin 4.17, bin 4.17-bin 10.10, bin 6.13-bin 9.10, and bin 7.2-bin 11.10) that were previously reported in different rice mapping populations [12,13,16] were identified either as AGs or associated FGUs within the same branches of the CT genetic networks in this study (Figure 2; Additional file 1: Table S4). The presence of these loci in the same AG or in highly associated FGUs within the same branches of the putative genetic networks indicates that the donor alleles at the associated loci were co-responding to the strong selection for CT. These loci were therefore most likely involved in the same CT pathway or in related ones. We realize our strategy has several limitations. First, the small number of selected ILs had low power to detect significant loci and to differentiate IFs among significant FGUs and their hierarchical relationships. Second, owing to possible genetic drift, some of the significant LDs between and among unlinked FGUs may not represent true epistasis. Third, the pairwise LD analyses used in this study had low power to detect multi-locus structures with low and moderate IF when r was greater than 3. A more appropriate multi-locus probability test should take into consideration IF, the number of loci involved, and their non-random associations. Despite these limitations, we note two interesting properties of these putative genetic networks underlying rice CT, which resemble the gene networks of complex signaling pathways controlling abiotic stress tolerances in plants [32] and which have important biological implications.
The first property of interest involves the identification of some important genes controlling rice CT in bins 7.1-7.2, 2.6, 5.10, 12.5 (AG B1 ), 4.3, 9.9 (AG D1 ), 6.15, 10.9 (AG D1 ), 3.2, and 6.3. These loci were inferred as putative regulatory genes controlling rice CT pathways based on the theoretical expectation that regulatory genes respond to selection more strongly than do downstream ones [26] and the presence of donor alleles at these loci in all cold-tolerant ILs from each population (Figure 2). A bioinformatics search of these genomic regions revealed some interesting candidate genes. One of the three loci of AG B1 in bin 12.5 at the top of the C418/Bg300 CTgenetic network ( Figure 2B) has been previously identified as a major CT QTL (qCTS12) and fine-mapped to the short arm of chromosome 12 [1,9] (Figure 1). Transcriptomic analyses of a cold-tolerant progeny, K354 derived from B-IL1 ( Figure 2B) of the C418/ Bg300 population, identified two regulatory genes in bin 12.5 that were highly up-regulated by cold: Sir2 protein (LOC_Os12g07950) and protein phosphatase 2C [31] genes. The former gene is related to epigenetic function, while the latter is involved in stress signaling as inferred by Gene Ontology analysis. Bin 2.6 harbors several possible candidate regulatory genes encoding proteins highly up-regulated by cold in K354: cytochrome P450s, terpene synthase (LOC_Os02g36140 and LOC_Os08g07100), NB-ARC domain-containing protein, histone deacetylase, and transcription factor OsWRKY42. Similarly, a large-effect CT QTL, qCTS4, has been identified and fine-mapped to bin 4.3 (AG D1 ) [1,18]. Bin 6.3 ( Figure 2C) harbors a regulatory gene, OsDREB1D, that functions as one of several DREB transcription factors to enhance cold and salt tolerance in transgenic Arabidopsis [33] (Additional file 1: Table  S5). Furthermore, bins 3.5 and 9.9 of AG B3 in the upper layer of the C418/Bg300 population network harbors four regulatory candidates, the signal transducer OsMAPK5 and three transcription factors (OsiSAP1, OsDREB1A and OsDREB1B), all of which are known to play regulatory roles in rice CT [34][35][36] (Additional file 1: Table S5). The action of AG B3 (bins 3.5, 6.3, and 9.9) as a single FGU is also supported by the strong interactions among these CT QTLs in the same regions in another rice population [12]. Combined results from genetic and transcriptomic analyses thus appear to provide evidence for a role for these putative regulatory genes in control of rice CT. Nevertheless, over-expression and/ or knockout experiments in transgenic plants are required to confirm whether any of these genes actually correspond to the important FGUs identified in the putative rice CT networks.
The second noted property is the fact that every putative genetic network consisted of two to three independent branches, each presumably corresponding to a putative downstream pathway as inferred from epistatic relationships. In addition, the phenotypic similarity in CT-related traits of ILs with different multilocus genotypes at these loci suggests the 'functional redundancy' of these rice CT downstream pathways. An extensive bioinformatics search of the Rice Genome Annotation Project database for genomic regions containing putative downstream CT FGUs identified several interesting positional candidate genes for rice CT. The identified candidates included six genes encoding putative cold-responsive (COR) proteins (COR_1-COR_5 and COR_7) located near three downstream FGUs adjacent to bin 2.18, AG C2 (bins 3.16 and 3.8), and AG B4 (bins 1.5, 5.15, and 6.12). Interestingly, the six downstream loci are all regulated by the rice DREB homolog OsDREB1D harbored in bin 6.3 [37], which was predicted to be the upstream locus of the six downstream loci in the putative CT genetic networks. In addition, genomic regions of AG B4 (bins 1.5, 5.15, and 6.12) in the C418/Bg300 population and two downstream loci near bins 5.12 and 6.12 in the C418/MNTH population harbor three COR genes (COR_3, COR_4, and COR_7) and one CT-related transcription factor (AP37), which are predicted to be regulated by one CT-related transcription factor, OsiSAP1, and two DREB homologs (OsDREB1A and OsDREB1B) near bin 9.9. Most of the above-mentioned CT candidate genes were found to be strongly up-regulated by cold in K354 derived from B-IL1 [38]. While individual characterization of the molecular functions of detected CT loci and putative pathways using the cold-tolerant ILs would be very challenging, genetic confirmation of putative CT genetic networks is not difficult. A straightforward strategy is to break each identified network or AG into individual loci and to evaluate their effects individually in progeny derived from crosses between selected ILs or between selected ILs and recurrent parent C418.
Evolutionally, the above-described genetic system underlying complex traits is expected maintain high levels of genetic variation even in populations under strong directional selection. For example, although all 30 cold-tolerant ILs selected in this study reached fixation after two rounds of selection for CT followed by two additional generations of selfing, significant amounts of genetic variation existed at all downstream loci in cold-tolerant ILs from each population. The same situation was observed for putative upstream regulatory loci when ILs from different populations were considered. This type of genetic system, with multiple pathways affecting the same traits, obviously can be (and indeed may have been) responsible for the maintenance of the tremendous genetic variation at loci for complex traits in plant and animal species, even under long-term directional selection [39,40].
Our results have important implications for improving complex traits in rice. Our data indicate that genetic complementarity and repulsive distribution of functional alleles in the parents provides an appropriate explanation for the hidden diversity and transgressive segregation of CT, observed in this study, and other complex traits in rice [20][21][22][23][24]. Examination of graphic genotypes of the cold-tolerant ILs (Figure 2A-D) revealed that each IL had the functional (donor) alleles of at least one upstream locus plus one or more downstream FGUs. This observation implies that the improved CT of the C418 ILs was apparently achieved by complementation of one or more broken pathways from introgressed indica alleles. The japonica rice gene pool is known to have very limited genetic diversity [5,31]. In addition, past rice breeding of both indica and japonica rice has focused on exploitation of within-subspecies diversity, as hybrid sterility and hybrid breakdown are present in populations derived from inter-subspecific crosses [41]. This general tendency of narrowing the genetic basis of the breeding parents followed by continued selection for the same suite of target traits in specific breeding populations would be expected to result in the fixation of functional alleles, with fewer pathways affecting target traits in progeny selected from the same population and, conversely, more differentiation observed in progeny selected from different populations ( Figure 2). Tremendous efforts should therefore be taken to broaden the genetic diversity of the elite rice gene pool. This goal may be accomplished by exploiting hidden diversity from distantly related parents or exotic germplasm, such as landraces and accessions of different subspecies, which are more likely to have complementary functional alleles for the broken pathways affecting complex traits in elite rice varieties [42].

Conclusions
Strong phenotypic selection for rice seedling CT in four rice BC populations resulted in the development of 30 cold-tolerant ILs. Genetic tracking and characterization of donor introgression in the 30 selected ILs using DNA markers revealed two important aspects of genome-wide responses to selection: (1) significant over-introgression of donor alleles at 57 loci across the rice genome, and (2) pronounced non-random associations resulting from strong epistasis between or among many alleles at unlinked loci. LD analyses of identified CT loci allowed us to infer putative genetic networks or multi-locus structures underlying rice seedling CT. Our results indicate that genetic complementarity and repulsive distribution of functional alleles in the parents provides an appropriate explanation for the hidden diversity and transgressive segregation of CT observed in this study and other complex traits in rice. Our results suggest that complex traits such as CT in rice are controlled by multiple pathways, each involving large numbers of loci, which are able to maintain high levels of genetic variation at loci affecting complex traits even under long-term directional selection.

Additional files
Additional file 1: Table S1. The list of 298 polymorphic SSR markers in 137 bins across the rice genome used to genotype the introgression lines from the four BC populations. Additional file 1: Table S2. Detailed information on linkage disequilibrium (LD) analysis of the perfect association groups (AGs) for seedling cold tolerance detected in ILs selected from four BC 2 F 2 populations between C418 and four different indica donors. Additional file 1: Table S3. Summarized results of 108 pairwise LD analyses between identified FGUs, either individual loci of excess introgression or perfect AGs, for seedling cold tolerance (CT) detected in 30 cold-tolerant ILs from four BC 2 F 2 between C418 and four different donors. Additional file 1: Table S4. Previously reported CT epistatic QTL intervals consistent with the relationships among FGUs of putative CT genetic networks detected in this study. Additional file 1: Table S5. Signal transducers, transcription factors, and responsive proteins related to CT near or within regions of the FGUs detected in this study.
Additional file 2: Figure S1. Seedling cold-tolerance phenotypes of introgression lines derived from four recurrent parent C418 and four