Saccharum officinarum x spontaneum R570 v2.1 Assembly & Annotation

Overview

Analysis Name:Saccharum officinarum x spontaneum R570 v2.1 Assembly & Annotation
Sequencing technology:PacBio Sequel II; Illumina HiSeq-2500
Assembly method:HiFiAsm v. 1; RACON v. 1.4.10
Release Date:2023-03-22
Reference Publication(s)

Healey AL, Garsmeur O, Lovell JT, Shengquiang S, Sreedasyam A, Jenkins J, Plott CB, Piperidis N, Pompidor N, Llaca V, Metcalfe CJ, Doležel J, Cápal P, Carlson JW, Hoarau JY, Hervouet C, Zini C, Dievart A, Lipzen A, Williams M, Boston LB, Webber J, Keymanesh K, Tejomurthula S, Rajasekar S, Suchecki R, Furtado A, May G, Parakkal P, Simmons BA, Barry K, Henry RJ, Grimwood J, Aitken KS, Schmutz J, D'Hont A. The complex polyploid genome architecture of sugarcane. Nature. 2024 Apr;628(8009):804-810. doi: 10.1038/s41586-024-07231-4.

Abstract

Sugarcane, the world’s most harvested crop by tonnage, has shaped global history, trade and geopolitics, and is currently responsible for 80% of sugar production worldwide [1]. While traditional sugarcane breeding methods have effectively generated cultivars adapted to new environments and pathogens, sugar yield improvements have recently plateaued [2]. The cessation of yield gains may be due to limited genetic diversity within breeding populations, long breeding cycles and the complexity of its genome, the latter preventing breeders from taking advantage of the recent explosion of whole-genome sequencing that has benefited many other crops. Thus, modern sugarcane hybrids are the last remaining major crop without a reference-quality genome. Here we take a major step towards advancing sugarcane biotechnology by generating a polyploid reference genome for R570, a typical modern cultivar derived from interspecific hybridization between the domesticated species (Saccharum officinarum) and the wild species (Saccharum spontaneum). In contrast to the existing single haplotype (‘monoploid’) representation of R570, our 8.7 billion base assembly contains a complete representation of unique DNA sequences across the approximately 12 chromosome copies in this polyploid genome. Using this highly contiguous genome assembly, we filled a previously unsized gap within an R570 physical genetic map to describe the likely causal genes underlying the single-copy Bru1 brown rust resistance locus. This polyploid genome assembly with fine-grain descriptions of genome architecture and molecular targets for biotechnology will help accelerate molecular and transgenic breeding and adaptation of sugarcane to future environmental conditions.

Assembly statistics

Assembly Source:JGI
Assembly Version:v2.0
Annotation Source:JGI
Annotation Version:v2.1
Total Scaffold Length (bp):5,046,770,891
Number of Scaffolds:144
Min. Number of Scaffolds containing half of assembly (L50):28
Shortest Scaffold from L50 set (N50):79,221,035
Total Contig Length (bp):5,042,101,904
Number of Contigs:842
Min. Number of Contigs containing half of assembly (L50):99
Shortest Contig from L50 set (N50):15,340,496
Number of Protein-coding Transcripts:299,731
Number of Protein-coding Genes:194,593
Percentage of Eukaryote BUSCO Genes:98.7
Percentage of Embryophyte BUSCO Genes:99.8
Assembly level:Chromosome
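
The L50 and N50 rows above follow the definitions in their labels: sort the sequences from longest to shortest, count how many are needed to reach half of the total length (L50), and report the length of the shortest sequence in that set (N50). A minimal Python sketch of that calculation follows. The function name `n50_l50` is illustrative, and the lengths are toy values, not R570 scaffolds or contigs.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)   # longest first
    half = sum(ordered) / 2                   # half of the total assembly length
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            # 'count' sequences cover half the assembly; 'length' is the shortest of them
            return length, count
    raise ValueError("lengths must be non-empty")


# Toy lengths (hypothetical): total 150, half 75, reached after 50 + 40
print(n50_l50([50, 40, 30, 20, 10]))  # (40, 2)
```

Applied to the real scaffold or contig lengths, the same function should reproduce the paired values reported above, assuming the statistics were computed with this standard definition.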

Assembly

The Saccharum officinarum x spontaneum R570 v2.1 Assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file) SofficinarumxspontaneumR570_771_v2.0.fa.gz

Gene Predictions

The Saccharum officinarum x spontaneum R570 v2.1 genome gene prediction files are available in GFF3 and FASTA format.

Downloads

Genes (GFF3 file) SofficinarumxspontaneumR570_771_v2.1.gene.gff3.gz
CDS sequences (FASTA file) SofficinarumxspontaneumR570_771_v2.1.cds.fa.gz
Protein sequences (FASTA file) SofficinarumxspontaneumR570_771_v2.1.protein.fa.gz
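
GFF3 is a nine-column, tab-separated format in which the third column holds the feature type, so a quick sanity check on a downloaded annotation is to tally feature types. The sketch below is a minimal example under stated assumptions: the helper names `count_feature_types` and `count_in_gzip` are hypothetical, the toy records are invented for illustration, and the feature type names actually used in the R570 gene file have not been checked here.

```python
import gzip
from collections import Counter


def count_feature_types(lines):
    """Tally column 3 (feature type) over an iterable of GFF3 lines."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue                      # skip blank lines, directives and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) == 9:              # well-formed GFF3 records have nine columns
            counts[fields[2]] += 1
    return counts


def count_in_gzip(path):
    """Same tally, reading a gzip-compressed GFF3 file from a local path."""
    with gzip.open(path, "rt") as handle:
        return count_feature_types(handle)


# Toy records (hypothetical), not taken from the R570 annotation
toy = [
    "##gff-version 3\n",
    "Chr1A\ttoy\tgene\t100\t900\t.\t+\t.\tID=g1\n",
    "Chr1A\ttoy\tmRNA\t100\t900\t.\t+\t.\tID=g1.1;Parent=g1\n",
    "Chr1A\ttoy\tmRNA\t100\t800\t.\t+\t.\tID=g1.2;Parent=g1\n",
]
counts = count_feature_types(toy)
print(counts["gene"], counts["mRNA"])  # 1 2
```

For the real file, `count_in_gzip("SofficinarumxspontaneumR570_771_v2.1.gene.gff3.gz")` would be the call, assuming the file has been downloaded to the working directory.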

Functional Analysis

Functional annotation for the Saccharum officinarum x spontaneum R570 v2.1 genome is available for download below. The proteins were analyzed using InterProScan to assign InterPro domains (Pfam).

Downloads

Domain from InterProScan Saccharum_x_sp.Pfam.tsv.gz

S genes

Summary

Query         Chromosome  Size(bp)  Coordinates                           tBLASTn Hit      tBLASTn %ID  Domain
DUF247I-S1    Chr4A       82147227  45451459-45453150                     Pnotatum         68           DUF247
DUF247I-S2    Chr4B       67335593  35359872-35361542                     Pnotatum         72           DUF247
DUF247I-S3Ψ   Chr4B       67335593  35352426-35353532                     Pnotatum         76           DUF247
DUF247I-S4    Chr4C       66740618  34973006-34974700                     Pnotatum         62           DUF247
DUF247I-S5    Chr4E       51224956  17425601-17427268                     Pnotatum         64           DUF247
DUF247I-S6    Chr4F       50041630  18076523-18078217                     Pnotatum         62           DUF247
DUF247II-S1   Chr4A       82147227  45118168-45119784                     Pnotatum         64           DUF247
DUF247II-S2   Chr4B       67335593  35646182-35647807                     Pnotatum         71           DUF247
DUF247II-S3   Chr4C       66740618  34944043-34945665                     Pnotatum         65           DUF247
DUF247II-S4   Chr4D       56614552  37553438-37555051                     Pnotatum         60           DUF247
DUF247II-S5   Chr4E       51224956  17162933-17164552                     Pnotatum         61           DUF247
DUF247II-S6   Chr4F       50041630  16872511-16874133                     Pnotatum         65           DUF247
HPS10-S1      Chr4A       82147227  46096791-46096953, 46097063-46097160  Pvaginatum       62           -
HPS10-S2      Chr4B       67335593  35356085-35356194, 35356279-35356444  Pvaginatum       47           -
HPS10-S3      Chr4B       67335593  35348653-35348762, 35348847-35349012  Pvaginatum       47           -
HPS10-S4      Chr4C       66740618  34988926-34989044, 34989165-34989303  Pvaginatum       51           -
HPS10-S5      Chr4E       51224956  17723108-17723220, 17723323-17723464  Pvaginatum       50           -
HPS10-S6      Chr4F       50041630  18096861-18096979, 18097100-18097268  Pvaginatum       51           -
DUF247I-Z1Ψ   Chr7A       71344650  63428166-63428759                     Pnotatum         60           DUF247
DUF247I-Z2    Chr7B       68479205  61210945-61212525                     Pnotatum         58           DUF247
DUF247I-Z3    Chr7C       63301622  56642183-56643748                     Pnotatum         61           DUF247
DUF247I-Z4    Chr7D       62522272  55330487-55332142                     Pnotatum         53           DUF247
DUF247I-Z5Ψ   Chr7E       57896127  50081317-50082054                     Pvaginatum       71           DUF247
DUF247I-Z6    Chr7_10A    92477847  86187493-86189121                     Pvaginatum       47           DUF247
DUF247I-Z7    Chr7os1     28450685  24061534-24063138                     Pnotatum         59           DUF247
DUF247II-Z1Ψ  Chr7A       71344650  63465565-63465750                     Trufipilum       53           DUF247
DUF247II-Z2   Chr7B       68479205  61216457-61218127                     Pnotatum         49           DUF247
DUF247II-Z3   Chr7C       63301622  56674475-56676148                     Pnotatum         51           DUF247
DUF247II-Z4   Chr7D       62522272  55326178-55327854                     Pnotatum         50           DUF247
DUF247II-Z5   Chr7_10A    92477847  86233541-86235211                     Pnotatum         47           DUF247
DUF247II-Z6   Chr7os1     28450685  24109889-24111520                     Pnotatum         48           DUF247
HPS10-Z1      Chr7B       68479205  61214485-61214656, 61214963-61215117  Pvaginatum       52           -
HPS10-Z2      Chr7C       63301622  56646243-56646334, 56646512-56646650  Olongistaminata  75           -
HPS10-Z3      Chr7D       62522272  55328126-55328211, 55328481-55328667  Ocoarctata       53           -
HPS10-Z4      Chr7E       57896127  50083900-50083994, 50084317-50084455  Orufipogon       46           -
HPS10-Z5      Chr7_10A    92477847  86232194-86232356, 86232455-86232564  Pnotatum         61           -
HPS10-Z6      Chr7os1     28450685  24100177-24100348, 24100476-24100582  Pvaginatum       60           -
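
The coordinates in the summary above are given as one interval per gene, or as two comma-separated intervals for the HPS10 entries. A minimal Python sketch for turning such a coordinate string into intervals and pulling the corresponding bases out of a chromosome sequence follows. It assumes the coordinates are 1-based and inclusive, which the page does not state. The function names are illustrative, and strand is not handled because the table does not report it.

```python
def parse_intervals(text):
    """Parse 'start-end' or 'start-end, start-end' into a list of (start, end) tuples."""
    intervals = []
    for part in text.split(","):
        start, end = part.strip().split("-")
        intervals.append((int(start), int(end)))
    return intervals


def covered_length(intervals):
    """Total bases covered, assuming 1-based inclusive coordinates (an assumption)."""
    return sum(end - start + 1 for start, end in intervals)


def extract(sequence, intervals):
    """Concatenate the interval slices of 'sequence' (1-based inclusive, forward strand only)."""
    return "".join(sequence[start - 1:end] for start, end in intervals)


# Coordinates of HPS10-S1 on Chr4A, as listed in the summary
hps10_s1 = parse_intervals("46096791-46096953, 46097063-46097160")
print(hps10_s1)                  # [(46096791, 46096953), (46097063, 46097160)]
print(covered_length(hps10_s1))  # 261

# Toy sequence (hypothetical) to show the slicing
print(extract("ACGTACGTAC", [(2, 4), (7, 8)]))  # CGTGT
```

With the chromosome FASTA loaded into memory, `extract` could be applied to the Chr4A sequence and the parsed intervals. Genes on the reverse strand would additionally need reverse-complementing.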

© 2023 National Genomics Data Center, China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences