Analysis Name | Cleistogenes songorica v1.1 Assembly & Annotation |
Sequencing technology | PacBio |
Assembly method | GS De Novo Assembler v1.01 |
Release Date | 2020-11-26 |
Zhang J, Wu F, Yan Q, John UP, Cao M, Xu P, Zhang Z, Ma T, Zong X, Li J, Liu R, Zhang Y, Zhao Y, Kanzana G, Lv Y, Nan Z, Spangenberg G, Wang Y. The genome of Cleistogenes songorica provides a blueprint for functional dissection of dimorphic flower differentiation and drought adaptability. Plant Biotechnol J. 2021 Mar;19(3):532-547. doi: 10.1111/pbi.13483.
Summary
Cleistogenes songorica (2n = 4x = 40) is a desert grass with a unique dimorphic flowering mechanism and an ability to survive extreme drought. Little is known about the genetics underlying drought tolerance and its reproductive adaptability. Here, we sequenced and assembled a high-quality chromosome-level C. songorica genome (contig N50 = 21.28 Mb). Complete assemblies of all telomeres, and of ten chromosomes were derived. C. songorica underwent a recent tetraploidization (~19 million years ago) and four major chromosomal rearrangements. Expanded genes were significantly enriched in fatty acid elongation, phenylpropanoid biosynthesis, starch and sucrose metabolism, and circadian rhythm pathways. By comparative transcriptomic analysis we found that conserved drought tolerance related genes were expanded. Transcription of CsMYB genes was associated with differential development of chasmogamous and cleistogamous flowers, as well as drought tolerance. Furthermore, we found that regulation modules encompassing miRNA, transcription factors and target genes are involved in dimorphic flower development, validated by overexpression of CsAP2_9 and its targeted miR172 in rice. Our findings enable further understanding of the mechanisms of drought tolerance and flowering in C. songorica, and provide new insights into the adaptability of native grass species in evolution, along with potential resources for trait improvement in agronomically important species.
Assembly statistics
Genome size (bp) | 540,115,686 |
GC content | 45.01% |
Genome sequence No. | 103 |
Maximum genome sequence length (bp) | 38,883,845 |
Minimum genome sequence length (bp) | 44,100 |
Average genome sequence length (bp) | 5,243,842 |
Genome sequence N50 (bp) | 28,480,272 |
Genome sequence N90 (bp) | 19,463,092 |
Assembly level | Chromosome |
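The N50 and N90 figures above follow the usual definition: sort the sequences from longest to shortest and report the length of the sequence at which the running total first reaches 50% (or 90%) of the assembly size. The sketch below is a minimal illustration of that definition, not the pipeline used to produce this release; the function names `nxx` and `summarize_lengths` are hypothetical, and the input is assumed to be a plain list of sequence lengths in bp.

```python
def nxx(lengths, percent):
    """Return the Nxx statistic (e.g. percent=50 for N50) for a list of lengths in bp."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until `percent` % of the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer arithmetic avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length
    return min(lengths)  # not reached for 0 <= percent <= 100


def summarize_lengths(lengths):
    """Collect the statistics reported in an assembly table from sequence lengths."""
    return {
        "genome_size": sum(lengths),
        "sequence_count": len(lengths),
        "max_length": max(lengths),
        "min_length": min(lengths),
        "average_length": round(sum(lengths) / len(lengths)),
        "n50": nxx(lengths, 50),
        "n90": nxx(lengths, 90),
    }
```

As a consistency check on the table itself, dividing the genome size by the number of sequences (540,115,686 / 103) rounds to the reported average length of 5,243,842 bp.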
The Cleistogenes songorica v1.1 Assembly file is available in FASTA format.
Downloads
Chromosomes (FASTA file) | GWHANUQ00000000.genome.fasta.gz |
The Cleistogenes songorica v1.1 genome gene prediction files are available in GFF3 and FASTA format.
Downloads
Genes (GFF3 file) | GWHANUQ00000000.gff.gz |
CDS sequences (FASTA file) | GWHANUQ00000000.RNA.fasta.gz |
Protein sequences (FASTA file) | GWHANUQ00000000.Protein.faa.gz |
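For a quick look at what the gene prediction file contains, one option is to tally feature types from the GFF3 file. The sketch below assumes the file follows the standard GFF3 layout (nine tab-separated columns, feature type in column 3, comment lines starting with `#`, and an optional `##FASTA` section at the end). The function names are hypothetical, and the feature types actually present in this release are not documented here.

```python
import gzip
from collections import Counter


def count_feature_types(lines):
    """Count GFF3 feature types (column 3) from an iterable of text lines."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # embedded sequence section: no more feature lines follow
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments, and directives
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # skip malformed lines rather than guessing their layout
        counts[fields[2]] += 1
    return counts


def count_feature_types_in_file(path):
    """Open a gzip-compressed GFF3 file and count its feature types."""
    with gzip.open(path, "rt") as handle:
        return count_feature_types(handle)
```

Calling `count_feature_types_in_file("GWHANUQ00000000.gff.gz")` on the downloaded file would return a `Counter` keyed by feature type.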
Functional annotation for the Cleistogenes songorica v1.1 assembly is available for download below. The proteins were analyzed with InterProScan to assign InterPro domains (Pfam).
Downloads
Domain from InterProScan | Cleistogenes_songorica.Pfam.tsv.gz |
Summary
Query | Chromosome | Size(bp) | Coordinates | tBLASTn Hit | tBLASTn %ID | Domain |
DUF247II-Z1Ψ | GWHANUQ00000003 | 26015858 | 2135864-2135947 | Mlutarioriparius | 76 | DUF247 |
DUF247II-Z2Ψ | GWHANUQ00000017 | 31529402 | 1899121-1899405 | Ecrus-galli | 80 | DUF247 |
HPS10-Z1 | GWHANUQ00000003 | 26015858 | 2138346-2138478, 2138623-2138732 | LpsZ_chromosome2 | 26 | - |
HPS10-Z2 | GWHANUQ00000017 | 31529402 | 1891555-1891652, 1891795-1891927 | AerianthaHPS10-Z | 51 | - |
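The Coordinates column in the table above holds one or more `start-end` ranges separated by commas, so hits that span several segments need to be split before use. The sketch below parses such a string and sums the covered length. It assumes the coordinates are 1-based and inclusive, which the table does not state, and the function names `parse_coordinates` and `covered_length` are hypothetical.

```python
def parse_coordinates(text):
    """Parse a string like '100-200, 300-350' into a list of (start, end) tuples."""
    intervals = []
    for part in text.split(","):
        start_text, end_text = part.strip().split("-")
        start, end = int(start_text), int(end_text)
        if start > end:
            raise ValueError(f"start is greater than end in range: {part.strip()}")
        intervals.append((start, end))
    return intervals


def covered_length(intervals):
    """Total bases covered, ASSUMING 1-based inclusive coordinates."""
    return sum(end - start + 1 for start, end in intervals)


# Example using the HPS10-Z1 row from the table.
hps10_z1 = parse_coordinates("2138346-2138478, 2138623-2138732")
print(hps10_z1)                  # [(2138346, 2138478), (2138623, 2138732)]
print(covered_length(hps10_z1))  # 243
```

Under the same assumption, the single-range DUF247II-Z1Ψ hit (2135864-2135947) covers 84 bp.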