Screening of segregating F2 progenies and validation of DNA markers through bulk segregant analysis for phosphorous deficiency tolerance in rice

Phosphorus deficiency (PD) tolerance is a polygenic trait. Understanding the genetics of PD tolerance provides the basis for detecting quantitative trait loci (QTLs) and validating markers that can be used in marker-assisted breeding (MAB) in rice. The PD tolerance of Sri Lankan rice germplasm has been characterized. However, no attempts have been made to develop and validate DNA markers for breeding purposes or to understand the genetic basis of the traits. The present research project was conducted to assess PD-related traits and to validate internationally published DNA markers linked to PD tolerance using Sri Lankan rice cultivars. A total of 84 crosses were made and advanced to F2 and higher generations. From these crosses, a subset of three was selected based on overall PD tolerance and sensitivity, importance as mega production varieties and pedigree connections between the cultivars. Plant height, number of tillers, shoot dry weight, leaf width, flag leaf width and the color metrics L*, a*, b*, hue (h*) and chroma (C*) were measured on 200 individuals from each of the three populations grown under P-deficient (Po) soil conditions. Except for the color traits, the traits were normally distributed and exhibited high broad-sense heritability. The color metrics indicate possible epistatic interactions between the major underlying loci. From each population, two extreme bulks were selected from the highest and lowest ends of the shoot dry weight (SDW) distribution for bulk segregant analysis (BSA) to validate DNA markers for PD tolerance. The DNA marker K46-K1 was found to be usable for MAB of rice for PD tolerance. The genetic information generated in the present study can also be used for larger-scale genomic studies such as SNP, GBS and GWAS mapping.


INTRODUCTION
A lack of the required level of phosphorus (P) in the soil is a major constraint on profitable rice farming (Marschner and Marschner, 2012; Nielsen et al., 2001). Application of P fertilizer is expensive and can cause major problems such as environmental pollution and health hazards (Bennett et al., 2001; Cordell et al., 2009; Reddy et al., 1999). The production of P deficiency (PD) tolerant rice varieties through marker-assisted breeding (MAB) is regarded as the most logical and cost-effective solution to this problem (Collard et al., 2005). Although MAB for PD tolerance in rice is widely studied in other countries (Ni et al., 1998; Wissuwa et al., 1998; Wissuwa, 2005), it has received little attention in Sri Lanka. Recently, a set of rice landraces and varieties was screened for PD tolerance in Sri Lanka (Aluwihare et al., 2016). However, no attempts have been made to validate the DNA markers linked to PD tolerance for Sri Lankan rice germplasm. Employing DNA markers in MAB requires careful validation through genetic analysis (Xu and Crouch, 2008). The major QTL conferring PD tolerance in rice, Pup1, has been identified (Wissuwa and Ae, 2001a; Wissuwa et al., 2002), fine mapped, and characterized bioinformatically and genomically (Gamuyao et al., 2012; Heuer et al., 2009), and Pup1-linked DNA markers have been developed (Chin et al., 2010; Chin et al., 2011). To validate these markers for country-specific rice germplasm, bulk segregant analysis (BSA), a shortcut procedure for validating DNA markers using phenotypically distinct sets of individuals from segregating populations, can be employed (Venuprasad et al., 2009). In addition, the underlying DNA sequence variations at the marker loci could be used to detect associations between DNA polymorphisms and traits such as PD tolerance in rice. Therefore, the present study was conducted to validate the Pup1-linked DNA markers using BSA of F2 populations segregating for PD tolerance.

Plant material
A total of 12 landraces and 18 rice varieties were screened under PD soil conditions and assigned to a three-tier index of 3: tolerant; 2: moderately tolerant and 1: sensitive (Aluwihare et al., 2016). Approximately 84 crosses were made between tolerant and moderately tolerant / sensitive rice genotypes using clipping and hot water dipping methods (Tong and Yoshida, 2008). Three crosses, namely H-4 × Bg 357, Murungakayan × Bg 357 and Marss × Bg 357, were selected for the present study based on the degree of tolerance and importance to breeding. The collected F1 seeds were planted at the Rice Research and Development Institute (RRDI), Bathalagoda, Sri Lanka, and the F1 plants were carefully examined to remove any off-types arising from selfing. The F2 seeds were collected from the F1 plants, and 200 seeds (i.e. individuals) from each progeny were planted in a greenhouse at the University of Peradeniya, Sri Lanka. Ultisol soil collected from a field at RRDI that has been maintained without the addition of any fertilizer for the last 40 years was used as the growth medium. The Ultisol soil was characterized by a very low concentration of P (1 mg of P in 1 kg of soil) and of other nutrients (Kumaragamage and Indraratne, 2011; Sirisena and Wanninayake, 2014). The standard fertilizer dressings (without P) and other management practices were applied following the guidelines of the Department of Agriculture (DOA), Sri Lanka (Department of Agriculture, 2006).

Collection of trait data related to PD tolerance
Plant height (PlH), number of tillers (NT), shoot dry weight (SDW), leaf width (LW) and width of the middle region of the flag leaf (FLW) were measured at the early flowering stage, immediately after the emergence of the first panicle of the plant.
The color variation in leaves caused by PD conditions was captured in four replicates using the color metrics L*, a*, b*, chroma (C*) and hue angle (h*) with a spectrophotometer (CR-10, Konica Minolta, Tokyo, Japan). L*, a* and b* represent the black / white, green / red and blue / yellow axes respectively. C* and h* are calculated from a* and b* and indicate the sharpness of the color and the overall color respectively (Figure 1; modified from the images available at www.colorcodehex.com and www.rodsmith.org.uk).
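The dependence of C* and h* on a* and b* can be sketched with the standard CIELAB (LCh) definitions, in which chroma is the distance from the neutral axis in the a*-b* plane and hue is the angle in that plane. The function name and the example readings below are illustrative and are not taken from the study.

```python
import math


def chroma_and_hue(a_star, b_star):
    """Return (C*, h*) from CIELAB a* and b*.

    C* = sqrt(a*^2 + b*^2); h* = atan2(b*, a*) expressed in degrees
    in the range [0, 360).
    """
    chroma = math.hypot(a_star, b_star)
    hue_deg = math.degrees(math.atan2(b_star, a_star)) % 360.0
    return chroma, hue_deg


# Hypothetical reading for a greenish-yellow leaf (negative a*, positive b*).
c_star, h_star = chroma_and_hue(-3.0, 4.0)
print(f"C* = {c_star:.2f}, h* = {h_star:.2f} degrees")
```

Because both derived metrics are functions of a* and b* alone, strong correlations between them and a* or b* are expected by construction.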
Table 1: Tick marks indicate that the two parents are polymorphic for the particular DNA marker.

The eight individuals with the highest SDW and the eight individuals with the lowest SDW were selected from each F2 population for the BSA (Table 2). Immature leaf samples were used to extract DNA using the DNeasy® plant mini kit (Qiagen, Solna, Sweden). The trnH-psbA primer pair, a standard and universal plant DNA barcoding primer pair (Hollingsworth et al., 2011), was used to confirm the quality and suitability of the DNA for PCR amplification. The DNA from each set of eight individuals was bulked and used as the template DNA in PCR. For each F2 population, DNA of the two parents, DNA from the two bulks (highest and lowest SDW groups) and DNA from the individuals of each bulk (i.e. a total of 16 individuals) were used for PCR with the DNA markers given in Table 1. PCR was carried out in a Thermal Cycler (Takara, Japan) under the following conditions: initial denaturation at 94 °C for 5 min; 35 cycles of 94 °C for 30 sec, primer annealing temperature (Ta) (Table 1) for 90 sec and 72 °C for 2 min; and final extension at 72 °C for 10 min. The amplified PCR products were size separated by electrophoresis on 1.5 % agarose gels stained with ethidium bromide.
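The bulk selection step described above (eight individuals from each tail of the SDW distribution) can be sketched as a simple ranking. The function name, the plant identifiers and the SDW values are hypothetical and serve only to illustrate the procedure.

```python
def select_bulks(sdw_by_plant, bulk_size=8):
    """Split an F2 population into two extreme bulks by shoot dry weight.

    sdw_by_plant maps a plant identifier to its SDW value.
    Returns (tolerant_bulk, sensitive_bulk), each a list of identifiers
    ordered from lowest to highest SDW.
    """
    if len(sdw_by_plant) < 2 * bulk_size:
        raise ValueError("population too small for two non-overlapping bulks")
    ranked = sorted(sdw_by_plant, key=sdw_by_plant.get)
    sensitive_bulk = ranked[:bulk_size]   # lowest SDW
    tolerant_bulk = ranked[-bulk_size:]   # highest SDW
    return tolerant_bulk, sensitive_bulk


# Hypothetical population of 20 plants with made-up SDW values (g).
population = {f"P{i}": 0.5 * i for i in range(1, 21)}
tolerant, sensitive = select_bulks(population)
print("Tolerant bulk:", tolerant)
print("Sensitive bulk:", sensitive)
```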

Statistical analysis of PD tolerance data
All the tested parameters were used to calculate the Pearson's Correlation Coefficients (PCC) using the statistical package Minitab 16 (Minitab Inc., USA). Kolmogorov-Smirnov (KS) Coefficient, skewness and kurtosis were calculated to test the normality of trait distributions. The data distributions were plotted as histograms and parental values were marked to understand the data range and the presence of transgressive segregants.
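As a rough illustration of the quantities named above, the sketch below computes the Pearson correlation coefficient, moment-based skewness and excess kurtosis, and a Kolmogorov-Smirnov statistic against a normal distribution fitted to the sample. It uses only the Python standard library; Minitab may apply bias corrections, so its values can differ slightly. All function names and data are illustrative assumptions.

```python
import math
from statistics import NormalDist, mean, pstdev, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def skewness(x):
    """Moment-based skewness (no small-sample correction)."""
    m, s, n = mean(x), pstdev(x), len(x)
    return sum((v - m) ** 3 for v in x) / (n * s ** 3)


def excess_kurtosis(x):
    """Moment-based kurtosis minus 3, so a normal distribution gives 0."""
    m, s, n = mean(x), pstdev(x), len(x)
    return sum((v - m) ** 4 for v in x) / (n * s ** 4) - 3.0


def ks_statistic_normal(x):
    """KS distance between the sample and a normal fitted to it."""
    fitted = NormalDist(mean(x), stdev(x))
    ordered = sorted(x)
    n = len(ordered)
    d = 0.0
    for i, v in enumerate(ordered, start=1):
        f = fitted.cdf(v)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


# Made-up plant heights (cm) and shoot dry weights (g) for six plants.
heights = [52.0, 55.5, 60.1, 61.3, 66.0, 70.2]
weights = [1.1, 1.3, 1.2, 1.6, 1.8, 2.1]
print("r =", round(pearson_r(heights, weights), 3))
print("skewness =", round(skewness(weights), 3))
print("excess kurtosis =", round(excess_kurtosis(weights), 3))
print("KS D =", round(ks_statistic_normal(weights), 3))
```

Because the mean and standard deviation are estimated from the same sample, the usual KS critical values are only approximate for this statistic.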

Correlation among tested parameters
The PlH was not significantly correlated with NT. However, PlH and NT were each significantly correlated with SDW (PCC of 18 % to 46 %). In the H-4 × Bg 357 population, NT was significantly correlated with LW, but in the other two populations LW was correlated with PlH. In the H-4 × Bg 357 and Marss × Bg 357 populations, FLW was significantly correlated with PlH and LW. The color parameters L* and a* were strongly and negatively correlated with each other (PCC of -60 % to -81 %). The color parameter b* was also correlated with L* and with a*. The integrative color metrics, C* and h*, were significantly correlated with L*, a* and b*, except in the H-4 × Bg 357 and Marss × Bg 357 populations, where b* was not correlated with C* and h* (Table 3) (P < 0.05).

The nature of the trait distribution
The parameters PlH, NT, LW and FLW were normally distributed in all three populations. The SDW was not normally distributed in Murungakayan × Bg 357 and Marss × Bg 357. None of the five color parameters was normally distributed in any of the three populations; all deviated significantly from normality. The nature of the distributions and the presence of transgressive segregants with respect to the parental trait values are shown in Figures 2, 3, 4 and 5.

Heritability and heterosis
Broad-sense heritability (BSH) was high for all the traits in all three populations, ranging from 34 % to 98 %. Heterosis (H) was present in all traits. The detailed H and BSH values are given in Table 4.
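The text does not state which estimators were used for BSH and H. As a purely illustrative sketch, the code below uses two common textbook choices, which are assumptions here: BSH as the share of F2 phenotypic variance not explained by the environmental variance estimated from non-segregating generations, and mid-parent heterosis as the percentage deviation of the offspring mean from the parental average. All names and numbers are hypothetical.

```python
from statistics import mean, variance


def broad_sense_heritability(f2_values, nonsegregating_groups):
    """Assumed estimator: (V_F2 - V_E) / V_F2, floored at zero.

    V_E is the mean sample variance of the non-segregating groups
    (for example the two parents and the F1).
    """
    v_p = variance(f2_values)
    v_e = mean([variance(group) for group in nonsegregating_groups])
    return max(0.0, (v_p - v_e) / v_p)


def mid_parent_heterosis(offspring_mean, parent1_mean, parent2_mean):
    """Assumed estimator: percentage deviation from the mid-parent value."""
    mid_parent = (parent1_mean + parent2_mean) / 2.0
    return 100.0 * (offspring_mean - mid_parent) / mid_parent


# Made-up trait values, for illustration only.
f2 = [10.0, 12.0, 14.0, 16.0, 18.0]
parents = [[11.0, 12.0, 13.0], [15.0, 16.0, 17.0]]
print("BSH =", broad_sense_heritability(f2, parents))
print("H (%) =", mid_parent_heterosis(12.0, 8.0, 12.0))
```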

Epistatic nature of the inheritance in color metrics
A goodness-of-fit analysis for the epistatic ratios revealed that the color metrics b* and h* in the H-4 × Bg 357 population, a*, b* and h* in the Murungakayan × Bg 357 population and L* and h* in the Marss × Bg 357 population fit the epistatic ratio of 12:3:1. The color metrics b* and h* in Murungakayan × Bg 357 and b* and a* in Marss × Bg 357 fit the 10:3:3 ratio. The 9:6:1 ratio was not observed in any of the populations, and the 9:3:4 ratio matched the color metric b* in Marss × Bg 357 (P < 0.05) (Table 5).
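The goodness-of-fit test behind these ratios can be sketched as follows: the observed counts in three phenotypic classes are compared with the counts expected under a given ratio, and the resulting χ² value is compared with the critical value for two degrees of freedom (5.99 at P = 0.05). The class counts in the example are hypothetical, and the function name is an assumption.

```python
def chi_square_gof(observed, ratio):
    """Chi-square goodness of fit of observed class counts to a ratio.

    Returns (chi_square, expected_counts).
    """
    if len(observed) != len(ratio):
        raise ValueError("observed and ratio must have the same length")
    total = sum(observed)
    parts = sum(ratio)
    expected = [total * r / parts for r in ratio]
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi_square, expected


CRITICAL_DF2_P05 = 5.99  # chi-square critical value, 2 degrees of freedom

# Hypothetical counts for 200 F2 plants grouped into three color classes.
observed_counts = [150, 38, 12]
chi2, expected_counts = chi_square_gof(observed_counts, [12, 3, 1])
print("expected:", expected_counts)
print("chi-square:", round(chi2, 4))
print("ratio rejected at P = 0.05:", chi2 > CRITICAL_DF2_P05)
```

Under the usual convention, a ratio is not rejected when χ² falls below the critical value, so each candidate ratio (12:3:1, 10:3:3, 9:6:1, 9:3:4) can be checked by calling the function with the same counts.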

The marker validation through bulk segregant analysis
The SDW-based BSA did not reveal marker alleles exclusive to either the PD-tolerant or the PD-sensitive class (Figure 6 and Table 6). However, the K46-K1 marker allele was more specific to the tolerant group, indicating its applicability in MAB.

*P < 0.05 (expected χ² value: 5.99), **P < 0.01 (expected χ² value: 9.21), ***P < 0.001 (expected χ² value: 13.82)

Figure 6: The composite gel image for 13 Pup1-linked DNA markers subjected to bulk segregant analysis using the bulks selected for higher and lower shoot dry weights in three F2 populations. The F2 populations were screened under PD conditions. Marker names are indicated at the right and population names at the left. The top row shows the sample labels. TP: PD-tolerant parent; SP: PD-sensitive parent; TB: PD-tolerant bulk (a mixture of DNA from T1 to T8); SB: PD-sensitive bulk (a mixture of DNA from S1 to S8). T: tolerant; S: sensitive. The DNA barcoding primer pair trnH-psbA was used to confirm the quality of the DNA for PCR.

DISCUSSION
Dissecting the genetic basis of PD tolerance is not easy because it is a polygenic trait. However, the heritability of the traits associated with PD tolerance is found to be high, enabling the possible detection of the underlying QTLs (Majumder et al., 1989). The nature of inheritance of traits such as PD tolerance can be identified from the phenotypic variation in segregating populations. In the present study, more than 80 F1 crosses were made between the PD-tolerant and PD-sensitive rice cultivars reported in Aluwihare et al. (2016). These crosses are currently being advanced towards ~F10 with selection to produce recombinant inbred lines (RILs) in the breeding programs conducted by RRDI, Sri Lanka. For the present study, three crosses, namely H-4 × Bg 357, Murungakayan × Bg 357 and Marss × Bg 357, were selected for analysis. The selection of these crosses was based on a few factors. To understand the genetics of PD tolerance logically, one parent was kept common to all three crosses so that meaningful comparisons can be made across the rice cultivars. The improved rice variety Bg 357 was preferred as it is one of the most popular mega rice varieties in Sri Lanka. The parent H-4 was selected because it is the most PD-tolerant rice variety according to Aluwihare et al. (2016). The landraces Murungakayan and Marss were preferred because they are PD tolerant (Aluwihare et al., 2016) and, more importantly, because they are the parents of H-4. The use of multiple segregating populations with shared parents is a common practice in many genetic and breeding programs (Brim, 1966; Brown and Caligari, 2011; Kharkwal and Sharma, 2002; Suneson, 1956) and provides detailed comparisons as well as a basis for association mapping such as GWAS (Bush and Moore, 2012), if resources are available.
The distributions of the phenotypic data and the skewness and kurtosis parameters indicated that, apart from the color metrics and the SDW in two populations, the traits are mostly normally distributed (Figures 2, 3, 4 and 5). Normal distribution provides the rationale for the correct adoption of mapping algorithms in future experiments. As Majumder et al. (1989) observed, all tested parameters exhibit very high BSH, indicating that a considerable proportion of the variation in PD tolerance is genetic in all three populations.
The H estimates are highly variable among the traits, indicating the complexity of the inheritance of PD tolerance, and it is premature to conclude that there is a significant dominance effect for PD tolerance in the three populations. However, transgressive segregants were observed for all traits in all three populations, which could possibly be due to some form of H.
The color metrics L*, a*, b*, C* and h* are important because the leaf color variation associated with PD tolerance is not very distinct and is quantitatively distributed, so a continuous scale is required to capture the variation. Many gene mapping studies have used these color metrics for mapping QTLs (Espley et al., 2007; Sooriyapathirana et al., 2010; Uematsu et al., 2014). All the color metrics exhibit strongly non-normal distributions, indicating the presence of major genes with intergenic epistatic interactions. As the KS coefficient indicates, the distributions of the color metrics deviate strongly and significantly from normality. When the color metric data were categorized into three distinct groups for each population and subjected to chi-square goodness-of-fit analyses, some significant epistatic interactions were observed (P < 0.001). The color metric L* follows the 9:6:1 ratio in the Murungakayan × Bg 357 population. The color metric b* in the Marss × Bg 357 population follows the 10:3:3 epistatic ratio, and the color metric L* in the H-4 × Bg 357 population and the color metric h* in the Marss × Bg 357 population follow the 12:3:1 ratio. The other color traits were not significant for the epistatic ratios, possibly because only 200 individuals were screened for each population. However, these goodness-of-fit analyses suggest that there could be some major QTLs conferring the leaf color change due to P starvation, with complex effects modified by other QTLs and the environment.
Marker validation and QTL mapping using populations of 200 individuals each are expensive and tedious to undertake (Xu and Crouch, 2008). BSA was introduced as a shortcut method to validate DNA markers quickly with reduced inputs (Venuprasad et al., 2009). However, in the present study it was difficult to identify distinct groups or traits to define the bulks for BSA. Each trait is uniquely distributed and yielded its own tail groups, making an integrated approach to selecting bulks impossible. After several rounds of iteration to select bulks, SDW was chosen as the single parameter for selecting the bulks. The selection of SDW as the bulk-defining criterion is logical as it is highly and significantly associated with PD tolerance in rice (Wissuwa et al., 1998; Aluwihare et al., 2016; Wissuwa and Ae, 2001b). From each population, the eight individuals with the highest SDW and the eight individuals with the lowest SDW were selected as the PD-tolerant and PD-sensitive bulks respectively (Table 2). The selection of bulks from trait distribution curves is very common in genetics (Harris, 1911), especially in recent core genome-based gene expression and SNP array analyses (Kearsey and Farquhar, 1998; Varshney et al., 2009). The BSA revealed no bands exclusive to the tolerant or sensitive bulks, indicating high variability in the Pup1 QTL region. As clearly reviewed in Chin et al. (2011), phenotyping for the Pup1 QTL is confusing and SDW cannot be used as the sole criterion for BSA. The high transposon activity (Heuer et al., 2009) might complicate the genomic architecture and make BSA quite cumbersome.

CONCLUSION
The screening of the F2 rice progenies H-4 × Bg 357, Murungakayan × Bg 357 and Marss × Bg 357 segregating for PD tolerance indicates a complex nature of inheritance and high heritability estimates. The color variation due to P starvation is not normally distributed and shows epistatic gene interactions. The BSA using the tolerant and sensitive bulks selected from the three populations has validated K46-K1, which can be used in MAB for PD tolerance in rice.