Eragrostis tef Salk_teff_dabbi_3.0 Assembly & Annotation

Overview

Analysis Name Eragrostis tef Salk_teff_dabbi_3.0 Assembly & Annotation
Sequencing technology PacBio RSII; Illumina HiSeq
Assembly method Canu v. 1.4; 3D-DNA v. 180922
Release Date 2022-08-03
Reference Publication(s)

VanBuren R, Man Wai C, Wang X, Pardo J, Yocca AE, Wang H, Chaluvadi SR, Han G, Bryant D, Edger PP, Messing J, Sorrells ME, Mockler TC, Bennetzen JL, Michael TP. Exceptional subgenome stability and functional divergence in the allotetraploid Ethiopian cereal teff. Nat Commun. 2020 Feb 14;11(1):884. doi: 10.1038/s41467-020-14724-z.

Abstract

Teff (Eragrostis tef) is a cornerstone of food security in the Horn of Africa, where it is prized for stress resilience, grain nutrition, and market value. Here, we report a chromosome-scale assembly of allotetraploid teff (variety Dabbi) and patterns of subgenome dynamics. The teff genome contains two complete sets of homoeologous chromosomes, with most genes maintaining as syntenic gene pairs. TE analysis allows us to estimate that the teff polyploidy event occurred ~1.1 million years ago (mya) and that the two subgenomes diverged ~5.0 mya. Despite this divergence, we detect no large-scale structural rearrangements, homoeologous exchanges, or biased gene loss, in contrast to many other allopolyploids. The two teff subgenomes have partitioned their ancestral functions based on divergent expression across a diverse expression atlas. Together, these genomic resources will be useful for accelerating breeding of this underutilized grain crop and for fundamental insights into polyploid genome evolution.

Assembly statistics

Genome size 575.1 Mb
Total ungapped length 574.4 Mb
Number of chromosomes 20
Number of scaffolds 874
Scaffold N50 27.1 Mb
Scaffold L50 9
Number of contigs 1,541
Contig N50 1.4 Mb
Contig L50 121
GC percent 45.5
Genome coverage 73.0x
Assembly level Chromosome
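
For readers unfamiliar with the N50 and L50 figures above, the following is a minimal Python sketch of how such values are commonly derived from a list of sequence lengths. N50 is the length of the sequence at which the running total of the length-sorted sequences first reaches half of the assembly size, and L50 is the number of sequences needed to get there. The function name n50_l50 and the toy lengths are illustrative assumptions, not values or code from this assembly.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one sequence length")
    half = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count


# Toy lengths for illustration only (not teff scaffolds).
toy_lengths = [50, 40, 30, 20, 10]
print(n50_l50(toy_lengths))  # (40, 2)
```

Applied to the scaffold lengths of an assembly, the same calculation would be expected to give the kind of Scaffold N50 and Scaffold L50 values reported in the statistics above.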

Assembly

The Eragrostis tef Salk_teff_dabbi_3.0 Assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file) Eragrostis_tef.faa.gz
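
As a rough illustration of how the downloaded chromosome file might be inspected, the sketch below reads a gzip-compressed FASTA file with the Python standard library and reports per-sequence lengths and an overall GC percentage. The function names are hypothetical, the path is assumed to point to a local copy of the file listed above, and counting GC over A/C/G/T bases only (ignoring N and other symbols) is an assumption; this page does not state how its GC percent was computed.

```python
import gzip


def fasta_stats(lines):
    """Return ({sequence name: length}, GC percent) from FASTA lines."""
    lengths = {}
    gc = 0
    acgt = 0
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the sequence name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            seq = line.upper()
            lengths[name] += len(seq)
            gc += seq.count("G") + seq.count("C")
            # Assumption: only unambiguous bases count toward the denominator.
            acgt += sum(seq.count(base) for base in "ACGT")
    gc_percent = 100.0 * gc / acgt if acgt else 0.0
    return lengths, gc_percent


def fasta_stats_from_gz(path):
    """Open a gzip-compressed FASTA file in text mode and summarize it."""
    with gzip.open(path, "rt") as handle:
        return fasta_stats(handle)
```

With a local copy of the download, a call such as fasta_stats_from_gz("Eragrostis_tef.faa.gz") would return the chromosome lengths and GC percentage for comparison with the assembly statistics.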

Gene Predictions

The Eragrostis tef Salk_teff_dabbi_3.0 genome gene prediction files are available in GFF3 and FASTA format.

Downloads

Genes (GFF3 file) Eragrostis_tef.gff.gz
CDS sequences (FASTA file) Et_cds.fa.gz
Protein sequences (FASTA file) Et_pep.fa.gz
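
The GFF3 download can be summarized with a few lines of Python. The sketch below counts features per sequence and feature type, relying only on the nine tab-separated columns defined by the GFF3 format (seqid in column 1, type in column 3). The function names are hypothetical, and which feature types (gene, mRNA, CDS, and so on) the teff annotation actually contains is not stated on this page.

```python
import gzip
from collections import Counter


def count_feature_types(lines):
    """Count GFF3 features keyed by (seqid, feature type)."""
    counts = Counter()
    for line in lines:
        # An embedded FASTA section, if present, ends the feature records.
        if line.startswith("##FASTA"):
            break
        # Skip comments, directives, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            continue  # not a well-formed GFF3 feature line
        counts[(fields[0], fields[2])] += 1
    return counts


def count_feature_types_from_gz(path):
    """Apply count_feature_types to a gzip-compressed GFF3 file."""
    with gzip.open(path, "rt") as handle:
        return count_feature_types(handle)
```

For a local copy of the file listed above, count_feature_types_from_gz("Eragrostis_tef.gff.gz") would give a per-chromosome tally that can be compared across the homoeologous chromosomes.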

Functional Analysis

Functional annotation for the Eragrostis tef Salk_teff_dabbi_3.0 assembly is available for download below. The proteins were analyzed with InterProScan to assign InterPro domains (Pfam).

Downloads

Domain from InterProScan Eragrostis_tef.Pfam.tsv.gz

S genes

Summary

Query         Chromosome  Size (bp)  Coordinates                           tBLASTn Hit              tBLASTn %ID  Domain
DUF247I-S1Ψ   2A          35425885   16547076-16548158                     Pvirgatum                64           DUF247
DUF247I-S2    2B          30633641   13934951-13936612                     Pvirgatum                65           DUF247
DUF247II-SΨ   2A          35425885   16556070-16556345                     Pvirgatum                62           DUF247
HPS10-S1      2A          35425885   16548994-16549123, 16549350-16549459  Pvirgatum                60           -
HPS10-S2      2B          30633641   13941664-13941826, 13941942-13942048  Pvirgatum                59           -
DUF247I-Z1    7A          26459500   2638547-2640109                       LpZDUF247-I_chromosome2  58           DUF247
DUF247I-Z2    7B          23383462   2307379-2308995                       LpZDUF247-I_chromosome2  58           DUF247
DUF247II-Z1   7A          26459500   2634780-2636447                       Shybrid                  57           DUF247
DUF247II-Z2Ψ  7B          23383462   2302636-2303658                       Efulvus                  63           DUF247
HPS10-Z1      7A          26459500   2637270-2637370, 2637456-2637606      LpsZ_contig4538          59           -
HPS10-Z2      7B          23383462   22621978-22622084, 22622151-22622289  LpsZ_chromosome2         59           -
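
Several of the S gene hits above are given as one or two coordinate ranges on a chromosome. The following Python sketch shows one way to parse such a coordinate string and pull the corresponding bases out of a chromosome sequence. It assumes the coordinates are 1-based and inclusive and that the forward strand is wanted; neither is stated on this page, and the function names are hypothetical.

```python
def parse_intervals(text):
    """Parse '100-200, 300-400' into [(100, 200), (300, 400)]."""
    intervals = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        start, end = (int(value) for value in part.split("-"))
        if start > end:
            raise ValueError(f"start exceeds end in {part!r}")
        intervals.append((start, end))
    return intervals


def total_span(intervals):
    """Total number of bases covered, assuming inclusive coordinates."""
    return sum(end - start + 1 for start, end in intervals)


def extract(sequence, intervals):
    """Concatenate the listed ranges (assumed 1-based, inclusive, forward strand)."""
    return "".join(sequence[start - 1:end] for start, end in intervals)


# Coordinates of the HPS10-S1 hit as listed in the table above.
hps10_s1 = parse_intervals("16548994-16549123, 16549350-16549459")
print(total_span(hps10_s1))  # 240 under the inclusive-coordinate assumption
```

If a hit lies on the reverse strand, the extracted sequence would additionally need to be reverse-complemented before translation.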

© 2023 National Genomics Data Center, China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences