Raising crop yields: The missing links from molecular biology to plant breeding

This article is published under the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract: Rapid advances in molecular biology and functional genomics during the last two decades have created considerable optimism about their potential to increase global food production by raising the yields of major agricultural crops via breeding of new crop varieties. However, yield increases achieved so far have come largely through conventional plant breeding. This brief review outlines reasons why genetic transformations at the sub-cellular level in single plants grown in controlled environments may not necessarily translate into yield increases of plant populations in the field. Expected advantages of sub-cellular molecular interventions can be either dampened or reversed at the plant population level, where crop yields are determined. This could happen due to complex interactions, feedbacks and trade-offs across different levels of plant organization at both spatial and temporal scales, dependence on specific environmental requirements for expression of introduced genes, unrealistic experimental conditions used to generate transcriptomes and mismatches between the strategies required to achieve superior performance in single plants and in plant populations. It is argued that an understanding of the physiological basis of yield determination at the plant population/community level, a tight linkage of genetic modifications to specific yield components and a thorough evaluation of feedbacks and trade-offs during upscaling across different levels of plant organization are needed to harness the undoubted potential of molecular genetics for crop yield improvement.


THE CHALLENGE OF INCREASING CROP YIELDS AND THE POTENTIAL OF MOLECULAR BIOLOGY
The necessity to increase the yields of agricultural crops has never been felt more acutely than in recent times because of a combination of several factors. These include rising human population, decreasing land availability and soil fertility, diminishing water resources and increasing frequency of environmental stresses (Godfray et al., 2010; The Royal Society, 2009). Climate change triggered by increasing concentrations of greenhouse gases has contributed significantly to increasing the frequency and intensity of environmental stresses (e.g. drought, heat, salinity, flooding), thus exerting enormous pressure on global food production systems (Beddington et al., 2012; Wheeler and von Braun, 2013). During most of its recent history, mankind has been able to keep its food production in pace with the growing demand (Evans, 1998) through innovations and technological advances. These include synthetic inorganic fertilizers, new crop varieties resistant to key pests and diseases and substantial increases in potential yield (i.e. yield under optimum environmental and management conditions) via modification of canopy architecture and increased harvest index (Evans, 1993). Similar to the technological advances of the past, advances in Plant Molecular Biology and its associated disciplines (e.g. Functional Genomics, Bioinformatics, Proteomics, Metabolomics, Computational Biology, Systems Biology) during the last two decades have opened an enormous vista of new possibilities to bring about yield increases in major agricultural crops (Fedoroff et al., 2010; Tester and Langridge, 2010). However, according to Reynolds and Langridge (2016), the current genetic increase of 0.5-1% per annum in cereal crop yields is being achieved almost entirely via conventional plant breeding. Conventional plant breeding involves only the crossing of different parental genotypes, morphological evaluation of progeny, selection of individuals based on a few key traits (e.g. adequate tolerance to a specific pest, disease or stress factor at the early stages, and desirable morphological traits such as grain colour and shape, and yield, at the later stages of a breeding programme) and their purification and advancement through successive generations (Fehr, 1987; Allard, 1999; Moose and Mumm, 2008). The so-called 'Green Revolution', initiated by Norman Borlaug in the wheat fields of Mexico in 1944 to develop higher-yielding varieties of wheat, maize and rice in the two decades that followed, was also grounded firmly in the above procedures of conventional plant breeding, which Borlaug himself described as 'a tedious and time-consuming, hit-or-miss process with no guarantee of success' (Hessor, 2009). The substantial yield increases achieved during the 1960s and 70s started plateauing from the mid-1990s onwards (Lobell et al., 2009; McCouch et al., 2013), with only incremental gains being achieved since then (Fischer and Edmeades, 2010). Hence, the need to break these 'yield ceilings', especially in the principal food crops, has been a foremost challenge confronting agricultural scientists, both locally and globally (Khush, 1995; Cassman et al., 2003; Peng et al., 2008). In this scenario, the advances in Plant Molecular Biology, especially in Functional Genomics, to identify specific genes and their functions in various plant processes have brought renewed optimism for achieving the yield increases required to keep global food production ahead of its rapidly increasing demand (Minorsky, 2003; Moose and Mumm, 2008; Furbank and Tester, 2011). In particular, microarrays and 'high-throughput' techniques that allow screening and genetic characterization of large numbers of phenotypes and identification of large numbers of genes have generated a large volume of information on the signalling pathways and control mechanisms of various plant processes at the cellular and sub-cellular levels (Araus and Cairns, 2014).
The underlying assumption has been that improved understanding of crop processes at the molecular level would lead to development of new crop genotypes having greater yield and/or other desired traits. These other desired traits may be greater abiotic stress tolerance, greater pest and disease resistance, increased nutritive value or increased use efficiency of resources such as water and nutrients. The purpose of this review paper is to bring into focus some of the essential pre-requisites and difficulties that need to be overcome to realize the undoubted potential of recent advances in molecular biology, functional genomics and the biotechnological tools based on them to bring about increases in crop yields.

YIELD DETERMINATION OF CROPS: A COMPLEX INTEGRATION FROM MOLECULAR TO PLANT POPULATION LEVELS
The yield of a crop growing in the field (often measured as the biomass of its economically-important harvested part) is the net result of a series of internal processes occurring across different levels of plant organization (i.e. at sub-cellular, cellular, tissue, organ, whole plant and plant population/community levels) and their interaction with the external growing environment. From a Genetics and Plant Breeding perspective, crop yield is a complex polygenic trait determined by a large number of genes and with relatively low heritability (Ludlow and Muchow, 1990; Collard and Mackill, 2008; Passioura, 2012), where the influence of each individual gene is, generally, small. Furthermore, environmental variability encountered by crops growing in the field causes variations in the expression of different genes (Yin and Struik, 2008; Gu et al., 2014), thus increasing the season-to-season yield variability of a given genotype and decreasing the heritability of yield. Hence, the introduction of specific genes with known functions in improving a specific step of a process, or a process as a whole, does not necessarily guarantee a proportional increase in potential yield, greater tolerance to specific abiotic or biotic stresses or increased resilience to climate change. Key reasons why genetic transformations at the sub-cellular level may not always translate into crop yield increases at the plant population level are discussed below.

Functional balances, feedbacks and trade-offs
The complex network of internal processes operating across different levels of plant organization means that alteration of one specific process/pathway (or one or a few specific steps of it) often does not improve yield at the whole plant level (Hammer et al., 2004; Passioura, 2010). Yield performance of a whole plant is highly regulated via functional balances, feedbacks and trade-offs between different organs, tissues and processes. Several examples from a range of diverse processes illustrate the functional balances, feedbacks and trade-offs that occur during yield determination. For example, induction of deeper rooting to increase drought tolerance could occur at the expense of shoot growth and photosynthetic performance, thus reducing the expected yield advantage under drought.
Photosynthesis, a primary process contributing to crop yield determination, demonstrates clear instances of feedbacks and trade-offs across different hierarchical levels of plant organization. At the sub-cellular level, an increase in the specificity of Rubisco for carbon dioxide (to reduce photorespiration) results in a reduced catalytic rate (Long et al., 2006a), thus negating the expected increases in net photosynthetic rate via increases in the rates of carboxylation and Calvin cycle activity. Evans (1993) reported a wealth of experimental evidence showing that superior photosynthetic capacity, as measured by light-saturated photosynthetic rate, is poorly correlated with higher crop yield because greater metabolic investment in photosynthetic capacity could decrease the total leaf area of the crop. A theoretical analysis by Sinclair et al. (2004) shows that a 50% increase in mRNA transcripts for Rubisco synthesis is translated into only a 6% increase in yield when increased nitrogen for additional plant tissue is provided, while incurring a 6% yield decrease without additional nitrogen. While the increase in mRNA transcripts will increase Rubisco concentration on a per unit leaf area basis, maximum utilization of the additional Rubisco requires saturating light intensities for all leaves in a crop canopy. Such a situation does not exist when crops are grown as plant populations, in which only a fraction of the foliage canopy receives saturating light intensities while the rest is mutually shaded to varying degrees (Long et al., 2006a). Furthermore, a substantial increase in nitrogen supply would be required to translate increases in Rubisco concentration into yield increases. Firstly, more nitrogen is required for synthesizing additional Rubisco. Secondly, to achieve greater biomass via increased Rubisco-induced photosynthesis, additional nitrogen is required for building up structural plant tissue. Thirdly, nitrogen is also required for grain filling. Therefore, as shown by Sinclair et al. (2004), without increasing the absorption and storage capacity for nitrogen, a mere genetic transformation to increase Rubisco concentration could even result in a reduction of yield due to the feedbacks and trade-offs between different processes from sub-cellular to plant population levels.
Similarly, increased activity of key enzymes of the C4 photosynthetic pathway (e.g. PEP-carboxylase; pyruvate, orthophosphate dikinase) via genetic transformation has not resulted in clear and consistent increases in photosynthetic rates in C3 crops (Matsuoka et al., 2001). Enhancing the activities of C4 enzymes has been a key step in engineering the C4 pathway in rice (Sheehy et al., 2000; von Caemmerer et al., 2012). The absence of clear and consistent increases in photosynthesis following enhanced activity of C4 enzymes showed that further changes in leaf anatomy and biochemistry are needed to successfully engineer the C4 pathway in rice, a C3 crop (Brown, 1994; Evans and von Caemmerer, 2000; Yin and Struik, 2008; von Caemmerer et al., 2012). Ku et al. (2000) reported up to 35% increases in single leaf photosynthetic rates in rice and 10-30% greater grain yields of single plants transformed to overexpress PEP-carboxylase and pyruvate, orthophosphate dikinase. However, Yin and Struik (2008) showed that canopy photosynthetic rates of transgenic rice plants increased above those of the wild type only during the vegetative phase, while the opposite occurred during the grain-filling phase. This is because the increased carbon:nitrogen ratio in the leaves and the increased nitrogen requirement during grain filling induced greater leaf senescence due to limitation of nitrogen (Yin et al., 2000).
Another striking example of how an increased rate of a process at the cellular level is dampened at the crop level comes from CO2 enrichment studies. Even though a theoretical calculation based on the biochemistry of photosynthesis at the cellular level predicted a 36% increase in light-saturated leaf photosynthetic rates in a typical C3 crop in response to an increase of atmospheric CO2 concentration up to 550 ppm (projected to occur around the middle of the 21st century), a compilation of results from Free Air CO2 Enrichment (FACE) experiments across a range of C3 crops showed that total biomass and yield increased by only 17% and 13% respectively (Long et al., 2006b). This reduction of the cellular-level impact at the crop level was attributed to a series of feedbacks operating at the whole plant and field scales.
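The size of this dampening can be made explicit with simple arithmetic. The following is a minimal sketch, not part of the cited analysis: it takes the three percentages quoted above from Long et al. (2006b) and expresses the observed crop-level gains as a share of the predicted leaf-level gain. The helper name realized_fraction is hypothetical, and dividing a biomass or yield gain by a photosynthetic-rate gain is only a crude indicator of dampening, not a mechanistic estimate.

```python
# Percentages quoted in the text for C3 crops at about 550 ppm CO2 (Long et al., 2006b).
PREDICTED_LEAF_GAIN = 0.36    # theoretical gain in light-saturated leaf photosynthesis
OBSERVED_BIOMASS_GAIN = 0.17  # FACE experiments, total biomass
OBSERVED_YIELD_GAIN = 0.13    # FACE experiments, yield


def realized_fraction(observed_gain: float, predicted_gain: float) -> float:
    """Hypothetical helper: share of a predicted gain that is still visible at crop level."""
    if predicted_gain <= 0:
        raise ValueError("predicted_gain must be positive")
    return observed_gain / predicted_gain


biomass_share = realized_fraction(OBSERVED_BIOMASS_GAIN, PREDICTED_LEAF_GAIN)
yield_share = realized_fraction(OBSERVED_YIELD_GAIN, PREDICTED_LEAF_GAIN)

print(f"Share of predicted gain realized in biomass: {biomass_share:.0%}")
print(f"Share of predicted gain realized in yield:   {yield_share:.0%}")
```

With the quoted figures, a little under half of the predicted leaf-level gain appears in total biomass and just over a third appears in yield.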

Scaling-up across processes occurring at different temporal scales
In addition to the spatial scaling up from sub-cellular to plant population levels, an understanding of the events, processes, controls and feedbacks operating at different temporal scales, ranging from seconds (e.g. light capture, stomatal movement), through minutes (water absorption, xylem flow), hours (leaf expansion, root extension), days (leaf and root initiation) and weeks (biomass accumulation) to months (yield), is required when manipulating crop processes using molecular biological methods for improved crop performance (Passioura, 1982 & 1996; Sinclair et al., 2004; Hammer et al., 2004; Yin and Struik, 2008; Passioura, 2010; Zhu et al., 2010; Yin and Struik, 2015; Reynolds and Langridge, 2016). Along with processes occurring at different time scales, environmental variations occurring at different time scales should also be taken into account. For example, whereas most studies exploring the molecular aspects of abiotic stress tolerance have subjected the test plants to rapid stresses of high intensity (e.g. withholding water from plants growing in pots, sudden exposure to high levels of salinity or high temperatures), field-grown plants in natural environments experience a much more gradual stress development. Accordingly, plant responses even at the sub-cellular level (e.g. transcriptomic, proteomic and metabolomic changes) to such rapidly-imposed stresses would be different from those to gradual stress induction (Munns, 2005; Munns and Tester, 2008; Passioura, 2012).

Environmental requirements at specific phenological stages controlling specific gene expression
Crop growth and yield are always determined by the complex interaction between the crop's genetic make-up and its highly variable growing environment. The expression of a substantial portion of yield-determining genes is linked to specific environmental conditions being present at specific phenological stages of the crop. For example, expression or suppression of photoperiodism in certain crops (e.g. rice, sugarcane) requires the day length to be within a specific range during the photoperiod-sensitive phase of the life cycle (Yoshida, 1981; Yin et al., 1997). Furthermore, the contribution of certain genes to improved crop performance, especially under abiotic stresses, is dependent upon the timing of occurrence of the stress episode in relation to the crop's life cycle. For example, improved crop performance under early-season drought occurring at the vegetative phase of a crop requires a specific set of tolerance mechanisms, traits and genes which would be different from those required for improved performance under drought occurring at the reproductive phase (Passioura, 2007; Passioura, 2012; Tardieu et al., 2014). Hence, induction of greater stress tolerance requires not only the introduction of specific genes but also the control of their expression at the required phenological stages (Tardieu, 2012).
The intimate link between a plant's growing environment and its performance, especially under abiotic stresses, necessitates replication of actual field environmental conditions in the plant houses, growth chambers or laboratories where the potted plants subjected to gene expression and metabolic profiling are mostly grown (Passioura, 2006; Poorter et al., 2012). Even though Sinclair et al. (2004) and Delannay et al. (2012) have listed a few success stories, Mittler and Blumwald (2010) contend that, despite an enormous research effort in genomics, transcriptomics, proteomics and metabolomics, stress tolerance in the field has been demonstrated for only a very few of the genes identified as conferring tolerance under artificial environments.

Inter-plant competition in a crop growing in the field
Increasing the yield of a single plant grown in isolation does not necessarily increase the yield of a plant population (or a community) when it is grown in the field as an agricultural crop. Inter-plant competition even within an agricultural crop (a population of plants belonging to the same species) means that traits that contribute to greater performance of an isolated plant may not be the same as those that enable a plant population to maximize its overall performance (i.e. the yield of a crop per unit land area). Interestingly, Donald (1968), in his landmark paper on 'crop ideotypes' (ideal plants having a range of desirable traits to achieve a high yield performance), listed being a 'weak competitor' within its own population as one such trait. While a plant with a weak competitive ability would perform poorly in a natural plant community, a population of weak competitors in the managed environment of an agricultural crop would help maximize overall performance because each plant allows its neighbours access to adequate resources. For example, the ability to use water conservatively in a water-limited situation, through traits such as sensitive stomata which close earlier during a stress episode, may be useful for a plant growing in isolation. However, in a plant population, the same trait would either leave the conserved water for the use of competing neighbouring plants or leave it unused in the soil profile until the end of the season, thereby losing the opportunity to produce more yield (Passioura, 1982). Therefore, the ability to maintain a higher (rather than lower) stomatal conductance during episodes of water stress is often associated with superior yield performance under drought (Araus et al., 2002; Blum, 2005; Blum, 2009; Tardieu, 2012).
In parallel to the inter-plant competition in a plant population, competition for assimilates and acquired nutrients (e.g. nitrogen) also occurs between different organs and processes within a single plant. Accordingly, both inter- and intra-plant competition mean that the impacts on crop yield of genetic transformations at the sub-cellular level are likely to be dampened at the whole plant and plant population levels.

DISSECTION OF CROP YIELD INTO YIELD COMPONENTS: AN APPROACH BASED ON AN UNDERSTANDING OF YIELD COMPONENTS
In view of the complex nature of crop yield and the interacting network of processes that determine it, an understanding of the basic physiological principles of crop yield determination could better harness the vast potential of functional genomics towards improving the potential yield and stress tolerance of agricultural crops in the field (De Costa, 2004). Yield components can be expressed in terms of a few simple equations as follows:
Yield per unit land area = Number of plants per unit land area × No. of harvestable organs initiated per plant × Fraction of initiated harvestable organs developed and filled × Mean weight of a filled harvestable organ    (1)
When harvestable organs (e.g. grains) are aggregated into special structures such as panicles (in rice), ears (wheat), cobs (maize) or pods (soybean), the second yield component can be further divided as:
No. of harvestable organs initiated per plant = No. of aggregate structures per plant × No. of harvestable organs initiated per aggregate structure    (2)
Equations 1 and 2 show that any yield improvement to be brought about by genetic modification or transformation at the sub-cellular level has to occur via improvement of at least one of the listed yield components. Employing this rationale when collecting information on the functioning of sequenced genes could improve the efficiency of efforts to bring about yield improvement via genetic transformation. This requires linking information from functional genomics to specific mechanisms and pathways which contribute to increasing one or more yield components.
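Equations 1 and 2 can be written as two small functions, which makes it easy to see that yield changes only through one of the listed components. This is a minimal sketch; the function names, the units (grams and square metres) and the rice-like input values are assumptions chosen for illustration and are not taken from any cited study.

```python
def organs_per_plant(aggregates_per_plant: float, organs_per_aggregate: float) -> float:
    """Eq. 2: harvestable organs initiated per plant."""
    return aggregates_per_plant * organs_per_aggregate


def yield_per_unit_area(
    plants_per_m2: float,
    organs_initiated_per_plant: float,
    filled_fraction: float,
    mean_organ_weight_g: float,
) -> float:
    """Eq. 1: yield in g per m2 as the product of four yield components."""
    return (
        plants_per_m2
        * organs_initiated_per_plant
        * filled_fraction
        * mean_organ_weight_g
    )


# Hypothetical, rice-like values used purely for illustration.
grains_per_plant = organs_per_plant(aggregates_per_plant=10, organs_per_aggregate=100)
crop_yield = yield_per_unit_area(
    plants_per_m2=25,
    organs_initiated_per_plant=grains_per_plant,
    filled_fraction=0.8,
    mean_organ_weight_g=0.025,
)

print(f"Grains initiated per plant: {grains_per_plant:.0f}")
print(f"Yield: {crop_yield:.0f} g per m2")
```

Because the relationship is a simple product, a given proportional change in one component produces the same proportional change in yield only if all other components stay constant, which is the condition that the compensatory mechanisms discussed later in this section tend to violate.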
When attempting to bring about yield improvement in resource-limited and/or stressed environments, simple relationships which dissect crop yield into mechanistic components (instead of the morphological components shown in eqs. 1 and 2) would bring greater focus to efforts at increasing yield performance under resource-limited conditions (i.e. stress tolerance):
Yield per unit land area = Amount of limiting resource (e.g. water) captured by the crop per unit land area × Amount of biomass produced per unit of limiting resource used × Fraction of biomass partitioned to the harvestable organs    (3)
Accordingly, crop yield in a water-limited situation, which is occurring with increasing frequency in many parts of the world, can be given as:
Water-limited yield per unit land area = Crop water use (i.e. evapotranspiration) per unit land area × Water use efficiency (i.e. biomass per unit of water used) × Harvest index    (4)
When attempting to improve drought tolerance (i.e. yield in a water-limited situation) via molecular methods, it is worth noting that drought tolerance in an agricultural crop (as opposed to drought survival in a natural plant species) can only be improved by increasing at least one of the three yield components in eq. 4. Therefore, any successful intervention at the genomic level should target a cellular process which has a direct mechanistic link to an increase in one of the three components when a given crop is experiencing a drought (Ludlow and Muchow, 1990).
Accordingly, a rationale based on a physiological/mechanistic framework as shown above should be employed in sifting through the large volume of information that is being generated not only on the genes, but also on the proteins (generated in Proteomics) and metabolites (Metabolomics) synthesized via metabolic pathways controlled by the identified genes. Such an approach is more likely to lead to identification of key genes and control pathways having direct influences on the determination of yield components, either morphological (eqs. 1 & 2) or mechanistic (eqs. 3 & 4), and to bring about yield increases of crops in the field. Use of Plant/Crop Systems Biology to construct models of plant/crop functioning by integrating the vast amount of information available from functional genomics, proteomics and metabolomics could help link the sub-cellular and cellular information with the plant/crop traits required to achieve yield increases in the field. In parallel to Plant Systems Biology, use of Crop Simulation Modelling to simulate field performance of plant populations (crops) could identify processes and factors which limit crop yields in different environments and management regimes (Boote et al., 2001). This could provide a crucial link between molecular biological work and practical plant breeding by bringing into focus the processes and pathways towards which molecular work should be directed (Hammer et al., 2002; Yin et al., 2018). Recent improvements in Crop Simulation Modelling, such as incorporation of 'Whole Genome Prediction', have expanded the potential for identification of desirable plant traits for higher-yielding new genotypes while taking into account genotype × environment × management interactions, and allow more efficient linking of genomics, crop physiology and plant breeding across different levels of plant organization (Messina et al., 2018).
Even when such a rational approach grounded in mechanistic pathways of crop growth and yield formation is employed in molecular biological work aimed at raising crop yields, potential complications abound. There can be negative feedbacks among the yield components listed in eqs. 1-4. Even when genetic modification successfully increases a yield component, it may not necessarily increase yield because of the compensatory mechanisms between yield components that are often reported. For example, increasing the number of aggregate structures (e.g. panicles) per unit land area could lead to decreased numbers of harvestable organs (e.g. grains) per aggregate structure. Similarly, the number of harvestable organs could be negatively correlated with the fraction of filled organs and/or the mean individual organ weight. Specifically, in cereals and grain legumes, increased grain number per panicle or pod is often associated with decreased weight of an individual grain (Donald and Hamblin, 1976). Yin and Struik (2008) provide several specific examples from rice and wheat where genomic interventions successfully raised a yield component but were unlikely to increase yield because of a failure to explore the negative correlations with other yield components. In citing these examples, Yin and Struik (2008) emphasize the need for functional genomics to be linked to crop physiology for better utilization of its techniques to achieve yield improvement in crops. Furthermore, higher seed growth rates can be correlated with shorter durations of seed growth. This can happen especially in grain legumes, where the high demand for nitrogen from the seeds leads to re-translocation of leaf nitrogen via breakdown of Rubisco, causing accelerated leaf senescence and curtailed crop duration (Sinclair and de Wit, 1975).
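The effect of compensation between grain number and grain weight can be illustrated with a small calculation built on eqs. 1 and 2. This is a sketch under stated assumptions: the power-law compensation rule, its strength parameter, the function names and all input values are hypothetical and are not drawn from Donald and Hamblin (1976) or Yin and Struik (2008).

```python
def grain_yield(
    panicles_per_m2: float,
    grains_per_panicle: float,
    filled_fraction: float,
    grain_weight_mg: float,
) -> float:
    """Eqs. 1 and 2 for a cereal, returning yield in g per m2 (mg converted to g)."""
    return panicles_per_m2 * grains_per_panicle * filled_fraction * grain_weight_mg / 1000.0


def compensated_grain_weight(
    base_weight_mg: float, base_grains: float, new_grains: float, strength: float
) -> float:
    """Assumed rule: grain weight scales with (base_grains / new_grains) ** strength.

    strength = 0 means no compensation; strength = 1 means full compensation.
    """
    return base_weight_mg * (base_grains / new_grains) ** strength


# Hypothetical baseline crop and a modification that raises grains per panicle by 20%.
baseline = grain_yield(300, 100, 0.85, 25.0)
changes = {}
for strength in (0.0, 0.5, 1.0):
    weight = compensated_grain_weight(25.0, base_grains=100, new_grains=120, strength=strength)
    modified = grain_yield(300, 120, 0.85, weight)
    changes[strength] = modified / baseline - 1.0
    print(f"compensation strength {strength:.1f}: yield change {changes[strength]:+.1%}")
```

Under these assumptions, the full 20% gain in grain number reaches yield only when there is no compensation, and it disappears entirely when individual grain weight falls in proportion to the increase in grain number.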
Among the mechanistic yield components, greater water use efficiency (WUE) could be correlated with lower total water use and/or lower harvest index and lower yield (Araus et al., 2002; Blum, 2005; Tambussi et al., 2007; Blum, 2009). Leaf carbon isotope discrimination (Δ), which has been proposed as a leaf-level surrogate to select genotypes/breeding lines with greater WUE (Farquhar and Richards, 1984; Farquhar et al., 1989), provides another example of feedbacks and trade-offs when leaf-level processes are scaled up to the crop level. Based on a theoretical analysis of the discrimination against the naturally-occurring heavier 13C isotope in CO2 during its absorption and subsequent use in the Calvin cycle, Farquhar and Richards (1984) predicted Δ to be negatively correlated with WUE and therefore with yield (according to eq. 4). However, actual field-testing showed both the expected negative correlation (Rebetzke et al., 2002) and an unexpected positive correlation (Condon et al., 1987; Hall et al., 1994; Condon et al., 2004), while Seibt et al. (2008) discuss possible complicating factors during up-scaling across spatial and temporal scales. Blum (2009) provides a strong argument, with supporting experimental evidence, for a negative correlation between total water use and WUE in eq. 4, which highlights that achievement of higher yields in water-limited environments (i.e. drought tolerance) requires mechanisms that allow plants to increase, rather than restrict, the use of available water (i.e. a 'water-spending' rather than a 'water-conserving' strategy).
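The trade-off between total water use and WUE in eq. 4 can be made concrete with a short numerical sketch. All scenario values below are hypothetical and chosen only to show the direction of the effect; they are not data from Blum (2009) or any other cited study, and the function name and units (mm of evapotranspiration, g of biomass per m2 per mm) are assumptions.

```python
def water_limited_yield(
    water_use_mm: float, wue_g_per_m2_per_mm: float, harvest_index: float
) -> float:
    """Eq. 4: crop water use x water use efficiency x harvest index, in g per m2."""
    return water_use_mm * wue_g_per_m2_per_mm * harvest_index


# Hypothetical scenarios: (evapotranspiration in mm, WUE in g per m2 per mm, harvest index).
scenarios = {
    "baseline": (300.0, 4.0, 0.40),
    "water-conserving": (240.0, 4.4, 0.40),  # WUE +10%, but water use -20%
    "water-spending": (330.0, 3.8, 0.40),    # WUE -5%, but water use +10%
}

yields = {name: water_limited_yield(*values) for name, values in scenarios.items()}

for name, value in yields.items():
    print(f"{name:>16}: {value:.1f} g per m2")
```

In this illustration the genotype with the higher WUE yields less than the baseline because its gain in WUE is smaller than its loss in water use, whereas the 'water-spending' genotype yields more despite a lower WUE.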

CONCLUDING REMARKS
Negative feedbacks and trade-offs when plant processes are scaled up from sub-cellular to plant population/community levels, and their interactions with variations in the growing environment, highlight the extremely complex and highly regulated nature of the metabolic pathways that determine crop yields. An appreciation and understanding of this complexity will help focus molecular biological work to increase its contribution towards plant breeding efforts to raise crop yields (Mittler and Blumwald, 2010; Langridge and Fleury, 2011). A multi-disciplinary approach with better collaboration between molecular biologists, plant/crop physiologists, systems analysts, experts in bioinformatics, agronomists and plant breeders, instead of each group working in isolation as frequently happens, is needed to bring an adequate return (in terms of increased food production and reduced hunger and poverty, especially in rural, low-income populations) on the substantial investment of public funds in research in these disciplines.