Aegilops speltoides Assembly & Annotation

Overview

Analysis Name Aegilops speltoides Assembly & Annotation
Sequencing technology Oxford Nanopore
Assembly method wtdbg2 version 2
Release Date 2022-04-12
Reference Publication(s)

Li LF, Zhang ZB, Wang ZH, Li N, Sha Y, Wang XF, Ding N, Li Y, Zhao J, Wu Y, Gong L, Mafessoni F, Levy AA, Liu B. Genome sequences of five Sitopsis species of Aegilops and the origin of polyploid wheat B subgenome. Mol Plant. 2022 Mar 7;15(3):488-503. doi: 10.1016/j.molp.2021.12.019.

Abstract

Common wheat (Triticum aestivum, BBAADD) is a major staple food crop worldwide. The diploid progenitors of the A and D subgenomes have been unequivocally identified; that of B, however, remains ambiguous and controversial but is suspected to be related to species of Aegilops, section Sitopsis. Here, we report the assembly of chromosome-level genome sequences of all five Sitopsis species, namely Aegilops bicornis, Ae. longissima, Ae. searsii, Ae. sharonensis, and Ae. speltoides, as well as the partial assembly of the Amblyopyrum muticum (synonym Aegilops mutica) genome for phylogenetic analysis. Our results reveal that the donor of the common wheat B subgenome is a distinct, and most probably extinct, diploid species that diverged from an ancestral progenitor of the B lineage to which the still extant Ae. speltoides and Am. muticum belong. In addition, we identified interspecific genetic introgressions throughout the evolution of the Triticum/Aegilops species complex. The five Sitopsis species have various assembled genome sizes (4.11–5.89 Gb) with high proportions of repetitive sequences (85.99%–89.81%); nonetheless, they retain high collinearity with other genomes or subgenomes of species in the Triticum/Aegilops complex. Differences in genome size were primarily due to independent post-speciation amplification of transposons. We also identified a set of Sitopsis genes pertinent to important agronomic traits that can be harnessed for wheat breeding. These newly assembled genome resources provide a new roadmap for evolutionary and genetic studies of the Triticum/Aegilops complex, as well as for wheat improvement.

Assembly statistics

Genome size (bp) 4,110,737,152
GC content 46.34%
Number of chromosome sequences 7
Number of genome sequences 10,842
Maximum genome sequence length (bp) 597,334,302
Minimum genome sequence length (bp) 1,744
Average genome sequence length (bp) 379,149
Genome sequence N50 (bp) 530,918,953
Genome sequence N90 (bp) 470,101,627
Assembly level Chromosome
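
The length-based statistics above (average length, N50, N90) can be recomputed from the assembly FASTA. The following is a minimal sketch, not the pipeline used to produce the reported figures. It assumes the common definition of Nx (the length of the shortest sequence in the smallest set of longest sequences that together cover x% of the assembly); the function names `sequence_lengths`, `nx`, `summarize`, and `lengths_from_gzip` are illustrative, not part of any published tool.

```python
import gzip
from typing import Dict, Iterable, List


def sequence_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {sequence name: length} from FASTA lines without storing sequences."""
    lengths: Dict[str, int] = {}
    name = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def nx(lengths: List[int], fraction: float) -> int:
    """Nx under the assumed definition: walk sequences from longest to shortest
    and return the length at which the running total reaches fraction * total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= fraction * total:
            return length
    return 0


def summarize(lengths: Dict[str, int]) -> Dict[str, int]:
    """Collect the length statistics listed on this page."""
    values = list(lengths.values())
    return {
        "genome_size": sum(values),
        "sequence_count": len(values),
        "max_length": max(values),
        "min_length": min(values),
        "average_length": sum(values) // len(values),
        "n50": nx(values, 0.5),
        "n90": nx(values, 0.9),
    }


def lengths_from_gzip(path: str) -> Dict[str, int]:
    """Read a gzip-compressed FASTA file, e.g. the assembly download."""
    with gzip.open(path, "rt") as handle:
        return sequence_lengths(handle)
```

For example, `summarize(lengths_from_gzip("GWHBFXR00000000.1.genome.fasta.gz"))` would scan the downloaded assembly once. Whether the result matches the table exactly depends on how the reported statistics were rounded and defined.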

Assembly

The Aegilops speltoides assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file) GWHBFXR00000000.1.genome.fasta.gz

Gene Predictions

Gene prediction files for the Aegilops speltoides genome are available in GFF3 and FASTA formats.

Downloads

Genes (GFF3 file) GWHBFXR00000000.1.gff.gz
CDS sequences (FASTA file) GWHBFXR00000000.1.RNA.fasta.gz
Protein sequences (FASTA file) GWHBFXR00000000.1.Protein.faa.gz
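
As a quick sanity check after downloading the GFF3 file, the features can be tallied by type. The sketch below relies only on the GFF3 standard (nine tab-separated columns, feature type in column 3, `#` comment lines, optional `##FASTA` section). Which feature types this particular file contains (for example `gene`, `mRNA`, `CDS`) is an assumption that the output itself will confirm or refute; the function names are illustrative.

```python
import gzip
from collections import Counter
from typing import Iterable


def count_feature_types(lines: Iterable[str]) -> Counter:
    """Count GFF3 records by the feature type in column 3."""
    counts: Counter = Counter()
    for line in lines:
        # An embedded FASTA section, if present, ends the annotation records.
        if line.startswith("##FASTA"):
            break
        # Skip blank lines, directives, and comments.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A valid GFF3 record has exactly nine columns; ignore anything else.
        if len(fields) != 9:
            continue
        counts[fields[2]] += 1
    return counts


def count_from_gzip(path: str) -> Counter:
    """Tally feature types in a gzip-compressed GFF3 file."""
    with gzip.open(path, "rt") as handle:
        return count_feature_types(handle)
```

Calling `count_from_gzip("GWHBFXR00000000.1.gff.gz")` returns a `Counter`, so `.most_common()` lists the feature types in descending order of frequency.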

Functional Analysis

Functional annotation for the Aegilops speltoides genome is available for download below. The proteins were analyzed using InterProScan to assign InterPro domains (Pfam).

Downloads

Domain from InterProScan Aegilops_speltoides.Pfam.tsv.gz

S genes

Summary

Query | Chromosome | Size (bp) | Coordinates | tBLASTn Hit | tBLASTn %ID | Domain
DUF247I-S | GWHBFXR00000001.1 | 470101627 | 104122359-104123960 | LpSDUF247-I_chromosome1 | 80 | DUF247
DUF247II-S | GWHBFXR00000001.1 | 470101627 | 103990663-103992231 | LpSDUF247-II_chromosome1 | 73 | DUF247
HPS10-S | GWHBFXR00000001.1 | 470101627 | 104107428-104107602, 104107864-104107997 | LpsS_contig12948 | 60 | -
DUF247I-Z | GWHBFXR00000002.1 | 583316874 | 544185491-544187086 | LpZDUF247-I_chromosome2 | 58 | DUF247
DUF247II-ZΨ | GWHBFXR00000002.1 | 583316874 | 544214992-544216095 | AlongiglumisDUF247II-Z | 49 | DUF247
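
The coordinates in the summary can be used to cut the corresponding S-gene regions out of a chromosome sequence. The sketch below assumes the coordinates are 1-based and inclusive, and that multi-part entries such as HPS10-S are segments to be joined in the order listed. Neither convention is stated on this page, and strand is not given, so the result may need reverse-complementing. The function names are illustrative.

```python
from typing import List, Tuple


def parse_coordinates(text: str) -> List[Tuple[int, int]]:
    """Parse 'start-end' or 'start-end, start-end' into (start, end) pairs."""
    segments = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        start_text, end_text = part.split("-")
        start, end = int(start_text), int(end_text)
        if start > end:
            raise ValueError(f"start exceeds end in segment: {part}")
        segments.append((start, end))
    return segments


def extract_segments(sequence: str, segments: List[Tuple[int, int]]) -> str:
    """Join the listed segments, treating coordinates as 1-based inclusive
    (an assumption; adjust if the source uses a different convention)."""
    pieces = []
    for start, end in segments:
        if start < 1 or end > len(sequence):
            raise ValueError(f"segment {start}-{end} lies outside the sequence")
        # Convert 1-based inclusive coordinates to a Python slice.
        pieces.append(sequence[start - 1:end])
    return "".join(pieces)
```

Under these assumptions, `parse_coordinates("104107428-104107602, 104107864-104107997")` yields two segments of 175 bp and 134 bp for HPS10-S, which `extract_segments` would concatenate once the sequence of GWHBFXR00000001.1 has been loaded.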


© 2023 National Genomics Data Center, China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences