Saccharum spontaneum Np-X Assembly & Annotation

Overview

Analysis Name: Saccharum spontaneum Np-X Assembly & Annotation
Sequencing technology: Illumina; PacBio
Assembly method: Canu v. 1.9
Release Date: 2022-03-04
Reference Publication(s):

Zhang Q, Qi Y, Pan H, Tang H, Wang G, Hua X, Wang Y, Lin L, Li Z, Li Y, Yu F, Yu Z, Huang Y, Wang T, Ma P, Dou M, Sun Z, Wang Y, Wang H, Zhang X, Yao W, Wang Y, Liu X, Wang M, Wang J, Deng Z, Xu J, Yang Q, Liu Z, Chen B, Zhang M, Ming R, Zhang J. Genomic insights into the recent chromosome reduction of autopolyploid sugarcane Saccharum spontaneum. Nat Genet. 2022 Jun;54(6):885-896. doi: 10.1038/s41588-022-01084-1.

Abstract

Saccharum spontaneum is a founding Saccharum species and exhibits wide variation in ploidy levels. We have assembled a high-quality autopolyploid genome of S. spontaneum Np-X (2n = 4x = 40) into 40 pseudochromosomes across 10 homologous groups, that better elucidates recent chromosome reduction and polyploidization that occurred circa 1.5 million years ago (Mya). One paleo-duplicated chromosomal pair in Saccharum, NpChr5 and NpChr8, underwent fission followed by fusion accompanied by centromeric split around 0.80 Mya. We inferred that Np-X, with x = 10, most likely represents the ancestral karyotype, from which x = 9 and x = 8 evolved. Resequencing of 102 S. spontaneum accessions revealed that S. spontaneum originated in northern India from an x = 10 ancestor, which then radiated into four major groups across the Indian subcontinent, China, and Southeast Asia. Our study suggests new directions for accelerating sugarcane improvement and expands our knowledge of the evolution of autopolyploids.

Assembly statistics

Genome size: 2.8 Gb
Total ungapped length: 2.8 Gb
Number of chromosomes: 40
Number of scaffolds: 1,033
Scaffold N50: 68.6 Mb
Scaffold L50: 17
Number of contigs: 15,510
Contig N50: 381.9 kb
Contig L50: 2,133
GC percent: 44.5
Genome coverage: 18.0x
Assembly level: Chromosome
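
For readers who want to recompute the N50 and L50 figures above from a list of scaffold or contig lengths, the usual definition can be sketched as follows. This is a minimal illustration, not the pipeline used to produce the published statistics; the function name `n50_l50` is hypothetical.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of sequence lengths in bp.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total
    length; L50 is the number of sequences in that set.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("no sequence lengths supplied")
```

Calling `n50_l50` on scaffold lengths and again on contig lengths would give the two pairs of values reported in the list above.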

Assembly

The Saccharum spontaneum Np-X assembly is available for download in FASTA format.

Downloads

Chromosomes (FASTA file) Saccharum_spontaneum_NpX.assembly.fna.gz
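
As a rough sketch of how the downloaded assembly could be inspected, the function below streams a gzipped FASTA file and returns per-sequence lengths together with the overall GC percentage. It assumes the file is an ordinary gzip-compressed FASTA; the function name `fasta_lengths_and_gc` is hypothetical, and GC is computed over A/C/G/T bases only, which may differ slightly from how the figure on this page was derived.

```python
import gzip


def fasta_lengths_and_gc(path):
    """Stream a gzipped FASTA file; return ({name: length}, GC percent)."""
    lengths = {}
    gc = 0      # count of G and C bases
    acgt = 0    # count of unambiguous bases (A, C, G, T)
    name = None
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Use the first whitespace-delimited token as the sequence name.
                parts = line[1:].split()
                name = parts[0] if parts else ""
                lengths[name] = 0
            elif name is not None:
                seq = line.upper()
                lengths[name] += len(seq)
                gc += seq.count("G") + seq.count("C")
                acgt += sum(seq.count(base) for base in "ACGT")
    gc_percent = 100.0 * gc / acgt if acgt else 0.0
    return lengths, gc_percent
```

For example, `fasta_lengths_and_gc("Saccharum_spontaneum_NpX.assembly.fna.gz")` would return one length per sequence in the file, assuming the file has been downloaded to the working directory.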

Gene Predictions

Gene prediction files for the Saccharum spontaneum Np-X genome are available in GFF3 and FASTA formats.

Downloads

Genes (GFF3 file) Saccharum_spontaneum_NpX.gff3.gz
CDS sequences (FASTA file) Saccharum_spontaneum_NpX.cds.fna.gz
Protein sequences (FASTA file) Saccharum_spontaneum_NpX.protein.faa.gz
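
A quick way to sanity-check the GFF3 download is to count features per chromosome. The sketch below assumes the file follows the standard nine-column, tab-separated GFF3 layout and that gene models use the feature type `gene`; both are assumptions about this particular file, and `count_features` is a hypothetical helper name.

```python
import gzip
from collections import Counter


def count_features(path, feature_type="gene"):
    """Count GFF3 features of one type per sequence ID (column 1)."""
    counts = Counter()
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("##FASTA"):
                break  # an optional embedded FASTA section ends the annotation
            if line.startswith("#") or not line.strip():
                continue  # skip directives, comments, and blank lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) != 9:
                continue  # ignore malformed lines rather than guessing
            if fields[2] == feature_type:
                counts[fields[0]] += 1
    return counts
```

The same function can be pointed at other feature types, for example `count_features(path, "mRNA")`, if the file contains them.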

Functional Analysis

Functional annotation for the Saccharum spontaneum Np-X genome is available for download below. The predicted proteins were analyzed with InterProScan to assign InterPro domains (Pfam).

Downloads

Domain from InterProScan Saccharum_spontaneum.Pfam.tsv.gz
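
To pull out proteins carrying a particular Pfam domain, the domain table can be filtered by signature accession. The sketch below assumes the file uses the standard InterProScan TSV column order (protein accession in column 1, signature accession in column 5, match start and stop in columns 7 and 8); this layout has not been verified for this specific download, and `proteins_with_signature` is a hypothetical helper name.

```python
import csv
import gzip


def proteins_with_signature(path, accession):
    """Map protein ID -> list of (start, stop) matches for one signature.

    Assumes InterProScan-style TSV: column 1 protein accession,
    column 5 signature accession, columns 7 and 8 match start and stop.
    """
    hits = {}
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) < 8 or row[4] != accession:
                continue
            hits.setdefault(row[0], []).append((int(row[6]), int(row[7])))
    return hits
```

The accession to pass in (for instance, the Pfam accession for the DUF247 family) should be looked up in Pfam/InterPro rather than taken from this sketch.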

S genes

Summary

Query | Chromosome | Size (bp) | Coordinates | tBLASTn Hit | tBLASTn %ID | Domain
DUF247I-S1 | Chr10A | 62086223 | 36843964-36845649 | Shybrid | 70 | DUF247
DUF247I-S2 | Chr10B | 62776581 | 35239746-35241422 | Shybrid | 67 | DUF247
DUF247I-S3 | Chr10C | 59107536 | 35474463-35476151 | Shybrid | 66 | DUF247
DUF247I-S4 | Chr10C | 59107536 | 35391752-35393440 | Shybrid | 66 | DUF247
DUF247I-S5 | Chr10D | 59917352 | 33271657-33273345 | Shybrid | 71 | DUF247
DUF247II-S1 | Chr10A | 62086223 | 36638159-36639805 | Shybrid | 64 | DUF247
DUF247II-S2 | Chr10B | 62776581 | 34555297-34556922 | Shybrid | 60 | DUF247
DUF247II-S3 | Chr10C | 59107536 | 34185050-34186663 | Shybrid | 61 | DUF247
DUF247II-S4 | Chr10D | 59917352 | 32758250-32759872 | Shybrid | 61 | DUF247
HPS10-S1 | Chr10A | 62086223 | 36642060-36642163, 36642249-36642414 | ShybridS1 | 73 | -
HPS10-S2 | Chr10B | 62776581 | 34559513-34559619, 34559697-34559859 | ShybridS1 | 58 | -
HPS10-S3 | Chr10C | 59107536 | 34188680-34188842, 34188964-34189073 | ShybridS1 | 54 | -
HPS10-S4 | Chr10C | 59107536 | 34213383-34213492, 34213614-34213776 | ShybridS1 | 54 | -
HPS10-S5 | Chr10D | 59917352 | 33844600-33844759, 33844908-33845020 | ShybridS1 | 67 | -
DUF247I-Z1 | Chr6A | 57781833 | 51728058-51729647 | Shybrid | 62 | DUF247
DUF247I-Z2 | Chr6B | 59431965 | 50372017-50373600 | Shybrid | 62 | DUF247
DUF247I-Z3 | Chr6C | 64971179 | 58793187-58794812 | Shybrid | 58 | DUF247
DUF247I-Z4 | Chr6D | 60951635 | 55855053-55856756 | Shybrid | 61 | DUF247
DUF247II-Z1 | Chr6A | 57781833 | 51776753-51778411 | Shybrid | 50 | DUF247
DUF247II-Z2 | Chr6B | 59431965 | 50248865-50250505 | Shybrid | 53 | DUF247
DUF247II-Z3 | Chr6C | 64971179 | 58806801-58808447 | Shybrid | 49 | DUF247
DUF247II-Z4 | Chr6D | 60951635 | 55869760-55871466 | Shybrid | 50 | DUF247
HPS10-Z1 | Chr6A | 57781833 | 51764237-51764393, 51764465-51764574 | ShybridZ5 | 42 | -
HPS10-Z2 | Chr6B | 59431965 | 50259004-50259110, 50259208-50259367 | ShybridZ6 | 56 | -
HPS10-Z3 | Chr6C | 64971179 | 58804978-58805146, 58805242-58805354 | Scereale | 64 | -
HPS10-Z4 | Chr6D | 60951635 | 55868527-55868680, 55868792-55868880 | ShybridZ5 | 35 | -
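
The Coordinates column in the summary above holds either a single `start-end` range or two comma-separated ranges for the HPS10 entries. A small sketch for turning such a value into intervals and a total length is given below. It assumes the coordinates are 1-based and inclusive, which the page does not state explicitly; the helper names `parse_coordinates` and `total_length` are hypothetical.

```python
def parse_coordinates(text):
    """Parse 'start-end' or 'start-end, start-end' into (start, end) tuples."""
    segments = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        start, end = (int(value) for value in part.split("-"))
        segments.append((start, end))
    return segments


def total_length(segments):
    """Sum segment lengths, assuming 1-based inclusive coordinates."""
    return sum(end - start + 1 for start, end in segments)


# DUF247I-S1 on Chr10A: a single range
duf = parse_coordinates("36843964-36845649")
# HPS10-S1 on Chr10A: two ranges
hps = parse_coordinates("36642060-36642163, 36642249-36642414")
```

Under the inclusive assumption, `total_length(duf)` gives 1686 bp and `total_length(hps)` gives 270 bp.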


© 2023 National Genomics Data Center, China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences