Cassava Brown Streak Virus Infection Genome

Genome-wide prediction and association analysis for sensitivity to cassava brown streak virus infection in cassava

Siraj Ismail Kayondo, Dunia Pino Del Carpio, Roberto Lozano, Alfred Ozimati, Marnin Wolfe, Yona Baguma, Vernon Gracen, Offei Samuel, Robert Kawuki and Jean-Luc Jannink

ABSTRACT

Cassava (Manihot esculenta Crantz), a key carbohydrate source, faces an unprecedented challenge from viral diseases, most importantly cassava brown streak disease (CBSD) and cassava mosaic disease (CMD). These viral diseases render the economic parts of the crop unmarketable, resulting in major economic losses. The completion of the cassava genome sequence equips cassava breeders with more precise selection strategies to offer superior varieties with both farmer- and industry-preferred traits. This article reports genomic segments associated with foliar and root CBSV sensitivity measured at different growth stages and under different environmental conditions. We identified significant single nucleotide polymorphisms (SNPs) associated with CBSV sensitivity in cassava on chromosomes 4 and 11. The significantly associated region on chromosome 4 co-localises with a Manihot glaziovii introgression from the wild progenitors, while the significant SNP markers on chromosome 11 are in linkage disequilibrium (LD) with a cluster of genes encoding nucleotide-binding site leucine-rich repeat (NBS-LRR) proteins, a class of disease resistance genes in plants. Genotype-by-environment interactions were significant, since SNP marker effects differed across environments and years.

Key words: genome-wide association studies (GWAS), virus sensitivity, augmented designs, de-regressed best linear unbiased predictions (dr-BLUPs), NBS-LRR proteins, QTLs

INTRODUCTION

Cassava (Manihot esculenta Crantz) is a major source of income and dietary calories for over a billion people across the globe, especially in Sub-Saharan Africa (SSA).
Cutting-edge technologies are rapidly turning cassava into an industrial crop, especially by tapping into its unique starch qualities, thereby opening new income opportunities for the poor (Pérez et al., 2011). Cassava brown streak disease (CBSD), a leading viral constraint limiting production across SSA, is responsible for economic losses at physiological maturity estimated at US$100 million per annum (ASARECA, 2013; Ndunguru et al., 2015). As a consequence of CBSVs, cassava yields in Uganda were recorded to be eight times lower than the expected yield potential. Two major strains, Cassava brown streak virus (CBSV) and Ugandan cassava brown streak virus (UCBSV), have successfully colonized both lowland and highland altitudes across East Africa, though newer strains are being reported (Winter et al., 2010; Ndunguru et al., 2015; Alicai et al., 2016a). In addition to the uncontrolled exchange of infected cassava stakes among cassava farmers across porous borders, the African whitefly (Bemisia tabaci) stands out as the best-known semi-persistent transmitter of the virus under field conditions (Legg et al., 2014; McQuaid et al., 2015). Upon entry, the virus exploits the plant’s transport system to move through the susceptible cassava plant, resulting in yellow chlorotic vein-clearing patterns along the minor veins of the leaves. Prominent brown elongated lesions, commonly referred to as “brown streaks”, form on the stem, while brown, necrotic, hard-corky layers form at random in the root cortex of most susceptible cassava clones. In view of the rapid and steady rate of virus evolution and the lack of dependable virus diagnostic tools (Alicai et al., 2016b), breeding for durable CBSD resistance emerges as a timely and economically viable option.
Earlier CBSD resistance breeding initiatives have highlighted the polygenic but recessive nature of its inheritance in both intraspecific and interspecific cassava hybrids (Nichols, 1947; Hillocks and Jennings, 2003; Munga, 2008; Kulembeka, 2010). Genetic improvement in a traditional cassava breeding pipeline has been slow because of several biology-related constraints, such as shy flowering, the length of the breeding cycle, limited genetic diversity and the slow rate of multiplication of planting materials. Most of the available elite cassava lines have exhibited some level of sensitivity to CBSVs, ranging from mild sensitivity to total susceptibility. Therefore, a concise but targeted exploration for potential sources of resistance using the available biotechnology tools could be a promising strategy. The completion of the cassava genome sequence equips cassava breeders with more precise selection strategies to offer superior varieties with both farmer- and industry-preferred traits. Bredeson et al. (2016) report the presence of introgression segments from the wild progenitors in the elite breeding lines developed by the Amani breeding program in Tanzania. Hence, sources of resistance to CBSD exist but may have been reshuffled over generations of recurrent selection; they are therefore not fully fixed and remain to be exploited. Moving forward, a genome-wide survey of existing natural variation, related to the observed phenotypes for a series of agronomic traits, could facilitate the identification of causal loci associated with the inheritance of a trait of interest. This approach, commonly referred to as a genome-wide association study (GWAS), uses statistical analyses to exploit the historical recombination events that have occurred over time (Jannink and Walsh, 2002; Hamblin, Buckler and Jannink, 2011).
Hence, GWAS will complement the bi-parental mapping efforts that have been widely applied in cassava breeding over the previous decade (Ferguson et al., 2012; Ceballos et al., 2015). GWAS have been widely undertaken by animal, human and plant geneticists to identify quantitative trait loci (QTLs) closely associated with several important traits. However, GWAS has been sparingly applied in cassava breeding, notably to define the genetic architecture of cassava mosaic disease (Wolfe et al., 2016) and beta-carotene content (unpublished). In this study, we took advantage of the reduced cost of genotyping by sequencing (GBS) to generate genotype data for our association mapping panel. The goal of this study was to identify genomic regions closely associated with sensitivity to CBSV infection in a diverse regional cassava breeding panel. Fine mapping around the identified regions would guide marker discovery as well as the identification of flanking genes for CBSV sensitivity, for use in marker-assisted breeding.

MATERIALS AND METHODS

Plant material

The data set comprised field disease evaluations undertaken across five locations in Uganda: Namulonge, Kamuli, Serere, Ngetta and Kasese. Two different but closely related GWAS panels were evaluated across environments. Between 2012 and 2013, GWAS panel 1, consisting of 308 to 429 entries, was replicated twice across three locations. Each trial was designed as a randomized complete block (RCB) with two-row plots of five plants each at a spacing of 1 m by 1 m. In 2015, GWAS panel 2, consisting of 715 to 872 clones, was evaluated at three locations with contrasting CBSD pressure. These entries were evaluated as single entries per site, connected by six common checks, in an augmented completely randomized block design with 38 blocks per site (Federer, Nguyen and others, 2002; Federer and Crossa, 2012).
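An augmented layout of the kind described for panel 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the randomization actually used in the trials: it assumes that every block carries all six checks and that the unreplicated entries are spread as evenly as possible over the 38 blocks. The function name `augmented_layout` and the clone and check labels are hypothetical placeholders.

```python
import random


def augmented_layout(entries, checks, n_blocks, seed=0):
    """Return a list of blocks; each block is a shuffled list of plot labels.

    Assumption: each unreplicated entry appears once in the whole trial,
    while every check appears once in every block.
    """
    rng = random.Random(seed)
    shuffled = list(entries)
    rng.shuffle(shuffled)
    blocks = []
    for b in range(n_blocks):
        # Deal entries round-robin so block sizes differ by at most one plot.
        plots = shuffled[b::n_blocks] + list(checks)
        rng.shuffle(plots)  # randomize plot order within the block
        blocks.append(plots)
    return blocks


# Hypothetical labels: 715 test clones and six checks in 38 blocks.
entries = [f"clone_{i:03d}" for i in range(1, 716)]
checks = [f"check_{k}" for k in range(1, 7)]
layout = augmented_layout(entries, checks, n_blocks=38)
```

Because the checks recur in every block, they provide the replication needed to estimate block effects, which is what allows the unreplicated entries to be compared across blocks.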
The two GWAS panels had one location in common, Namulonge, which is regarded as the CBSD hot spot with the highest CBSD pressure. The data were generated from 1281 cassava clones developed through three cycles of genetic recombination with local elite lines by the National Root Crops Breeding Program at NaCRRI. These cassava clones had a diverse genetic background whose pedigree could be traced back to introductions from the International Institute of Tropical Agriculture (IITA), the International Center for Tropical Agriculture (CIAT) and the Tanzania breeding program (Supplementary Fig. 1).

Phenotyping protocol for CBSV sensitivity

The key traits were CBSD severity and incidence, scored at 3, 6 and 9 months after planting (MAP) for foliar symptoms and at 12 MAP for root symptoms. CBSD severity was measured on a 5-point scale, with a score of 1 implying asymptomatic conditions and a score of 5 implying over 50% leaf vein clearing for foliar symptoms. At 12 MAP, a score of 5 implies that over 50% of the root core is covered by a necrotic corky layer (Fig. 1). Clones were given a score of 5 if pronounced vein clearing at the major leaf veins was displayed together with brown streaks on the stems and shoot die-back that appeared as a candle-stick. Clones with 31–40% leaf vein clearing together with brown streaks on the stems were given a score of 4. A score of 3 was assigned to clones with 21–30% leaf vein clearing and emerging brown streaks on the stems, while a score of 2 was assigned to clones that displayed only 1–20% leaf vein clearing without any visible brown streak symptoms on the stems. Plants with a score of 1 showed no visible sign of leaf necrosis or brown streaks on the stems. Root symptoms were likewise classified into five categories on a standard 5-point scale.
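The foliar scale above can be sketched as a small lookup in Python. This is a minimal illustration rather than the scoring procedure used in the field: the function names are hypothetical, stem symptoms are ignored, and, because the protocol does not assign the 41–50% range to a score, the sketch assumes that anything above 40% vein clearing scores 5. Incidence is assumed here to be the fraction of plants in a plot scoring above 1, which the text does not state explicitly.

```python
def foliar_severity_score(pct_vein_clearing):
    """Map percent leaf vein clearing to the 1-5 foliar severity scale.

    Assumption: values above 40% are scored 5, since the protocol
    leaves the 41-50% range unassigned. Stem symptoms are not modeled.
    """
    if not 0 <= pct_vein_clearing <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if pct_vein_clearing == 0:
        return 1  # asymptomatic
    if pct_vein_clearing <= 20:
        return 2
    if pct_vein_clearing <= 30:
        return 3
    if pct_vein_clearing <= 40:
        return 4
    return 5


def plot_incidence(scores):
    """Fraction of plants in a plot showing symptoms (score above 1)."""
    if not scores:
        raise ValueError("a plot needs at least one scored plant")
    symptomatic = sum(1 for s in scores if s > 1)
    return symptomatic / len(scores)


# Hypothetical plot of ten plants scored at one time point.
example_scores = [1, 1, 2, 3, 1, 1, 1, 5, 1, 1]
example_incidence = plot_incidence(example_scores)
```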
Two-stage genomic analyses

For panel 1, which was designed as a randomized complete block (RCB), we fit the model $y = X\beta + Z_{clone}c + Z_{range(loc.)}r + Z_{block(range)}b + \varepsilon$ using the lmer function from the lme4 R package (Bates et al., 2015). In this model, $\beta$ included a fixed effect for the population mean and location. The incidence matrix $Z_{clone}$ and the vector $c$ represent a random effect for clone, with $c \sim N(0, I\sigma^2_c)$, where $I$ represents the identity matrix. The range variable, which is the row or column along which plots are arrayed, is nested in location-rep and is represented by the incidence matrix $Z_{range(loc.)}$ and random effects vector $r \sim N(0, I\sigma^2_r)$. Block effects were nested in ranges and incorporated as random with incidence matrix $Z_{block(range)}$ and effects vector $b \sim N(0, I\sigma^2_b)$. Residuals were fit as random, with $\varepsilon \sim N(0, I\sigma^2_\varepsilon)$.

For panel 2, which followed an augmented design, we fit the model $y = X\beta + Z_{clone}c + Z_{block}b + \varepsilon$, where $y$ was the vector of raw phenotypes and $\beta$ included a fixed effect for the population mean and location, with checks included as a covariate. The incidence matrix $Z_{clone}$ and the vector $c$ are the same as above, and blocks were modeled with incidence matrix $Z_{block}$, where $b$ represents the random effect for block.

The best linear unbiased predictors (BLUPs) of the clone effect ($\hat{c}$) were extracted and de-regressed following the formula $\text{dr-BLUP} = \hat{c} / (1 - \mathrm{PEV}/\sigma^2_c)$, where PEV is the prediction error variance of the BLUP. Broad-sense heritability was calculated using variance components extracted from the two-stage lmer output. SNP-based heritability was calculated from the variance components of a one-step model fit with the emmreml function from the EMMREML R package (Akdemir and Okeke, 2015), in which the SNPs entered as a kinship matrix calculated using the A.mat function from the rrBLUP R package.

DNA preparation and genotyping by sequencing (GBS)

Total genomic DNA of all cassava clones included in the phenotypic data set was extracted from young tender leaves according to standard procedures using the DNeasy Plant Mini extraction kit (Qiagen, 2012).
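As an aside on the two-stage analysis described above, the de-regression and heritability steps can be sketched numerically in Python. This is a sketch under stated assumptions, not the authors' code: it assumes the common de-regression in which each BLUP is divided by its reliability, 1 − PEV/σ²_c (PEV being the prediction error variance), and a plot-basis broad-sense heritability σ²_c / (σ²_c + σ²_ε). The function names and all numeric values are hypothetical; in practice the inputs would come from the fitted mixed model.

```python
import numpy as np


def deregress_blups(blups, pev, var_clone):
    """Divide each clone BLUP by its reliability, 1 - PEV / var_clone."""
    blups = np.asarray(blups, dtype=float)
    pev = np.asarray(pev, dtype=float)
    reliability = 1.0 - pev / var_clone
    if np.any(reliability <= 0):
        raise ValueError("PEV must be smaller than the clone variance")
    return blups / reliability


def broad_sense_h2(var_clone, var_resid):
    """Plot-basis broad-sense heritability (assumed definition)."""
    return var_clone / (var_clone + var_resid)


# Hypothetical stage-one output for three clones.
blups = [0.40, -0.25, 0.10]
pev = [0.10, 0.20, 0.05]
var_clone, var_resid = 0.50, 0.75

dr_blups = deregress_blups(blups, pev, var_clone)
h2 = broad_sense_h2(var_clone, var_resid)
```

De-regression undoes the shrinkage that the mixed model applies to each clone, so that clones with less information (higher PEV) are not shrunk a second time when the values are used as the response in the association stage.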
Genotyping-by-sequencing (GBS) (Elshire et al., 2011) libraries were constructed using the ApeKI restriction enzyme as used before (Hamblin
