Triticum monococcum PI306540 Assembly & Annotation

Overview

Analysis Name: Triticum monococcum PI306540 Assembly & Annotation
Sequencing technology: PacBio HiFi
Assembly method: hifiasm v0.16.1-r375
Release Date: 2023-11-11
Reference Publication(s)

Ahmed HI, Heuberger M, Schoen A, Koo DH, Quiroz-Chavez J, Adhikari L, Raupp J, Cauet S, Rodde N, Cravero C, Callot C, Lazo GR, Kathiresan N, Sharma PK, Moot I, Yadav IS, Singh L, Saripalli G, Rawat N, Datla R, Athiyannan N, Ramirez-Gonzalez RH, Uauy C, Wicker T, Tiwari VK, Abrouk M, Poland J, Krattinger SG. Einkorn genomics sheds light on history of the oldest domesticated wheat. Nature. 2023 Aug;620(7975):830-838. doi: 10.1038/s41586-023-06389-7.

Abstract

Einkorn (Triticum monococcum) was the first domesticated wheat species, and was central to the birth of agriculture and the Neolithic Revolution in the Fertile Crescent around 10,000 years ago. Here we generate and analyse 5.2-Gb genome assemblies for wild and domesticated einkorn, including completely assembled centromeres. Einkorn centromeres are highly dynamic, showing evidence of ancient and recent centromere shifts caused by structural rearrangements. Whole-genome sequencing analysis of a diversity panel uncovered the population structure and evolutionary history of einkorn, revealing complex patterns of hybridizations and introgressions after the dispersal of domesticated einkorn from the Fertile Crescent. We also show that around 1% of the modern bread wheat (Triticum aestivum) A subgenome originates from einkorn. These resources and findings highlight the history of einkorn evolution and provide a basis to accelerate the genomics-assisted improvement of einkorn and bread wheat.

Assembly statistics

Genome size (bp): 5,116,257,294
GC content: 46.29%
Number of chromosome sequences: 7
Number of genome sequences: 15
Maximum genome sequence length (bp): 829,514,116
Minimum genome sequence length (bp): 142,272
Average genome sequence length (bp): 341,083,820
Genome sequence N50 (bp): 728,660,354
Genome sequence N90 (bp): 636,012,229
Assembly level: Chromosome
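
As a reading aid for the N50 and N90 figures above, the following is a minimal sketch of how such values are conventionally derived from a list of sequence lengths. The function name `nxx` and the toy lengths are illustrative assumptions, not part of this record, and the portal may compute its statistics with a different tool.

```python
def nxx(lengths, fraction=0.5):
    """Return the length L such that sequences of length >= L together
    cover at least `fraction` of the total assembly length."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    threshold = fraction * sum(lengths)
    running = 0
    # Walk from the longest sequence down until the threshold is reached.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length


# Toy lengths (hypothetical, not the real PI306540 sequences).
toy_lengths = [100, 80, 60, 40, 20]
toy_n50 = nxx(toy_lengths, 0.5)
toy_n90 = nxx(toy_lengths, 0.9)
print(toy_n50, toy_n90)

# Consistency check using only figures reported above:
# total size divided by the number of sequences gives the reported average.
reported_average = round(5_116_257_294 / 15)
print(reported_average)
```

With the real assembly, `lengths` would hold the 15 sequence lengths of the FASTA file.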

Assembly

The Triticum monococcum PI306540 genome assembly is available in FASTA format.

Downloads

Chromosomes (FASTA file): GWHCBHO00000000.1.genome.fasta.gz
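
Once downloaded, the gzipped FASTA file can be streamed without decompressing it to disk. The sketch below collects per-sequence lengths and an overall GC percentage. The function name is hypothetical, and the GC definition (A/C/G/T only, with N excluded from the denominator) is an assumption that may differ from the one behind the reported 46.29%.

```python
import gzip
import os
import tempfile


def fasta_lengths_and_gc(path):
    """Stream a gzipped FASTA file; return ({name: length}, GC percent).

    Assumption: GC percent is computed over A/C/G/T only, so N and other
    ambiguity codes count toward length but not toward GC content.
    """
    lengths = {}
    gc = 0
    acgt = 0
    name = None
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                name = line[1:].split()[0]  # identifier up to first whitespace
                lengths[name] = 0
            elif name is not None:
                seq = line.upper()
                lengths[name] += len(seq)
                gc_here = seq.count("G") + seq.count("C")
                gc += gc_here
                acgt += gc_here + seq.count("A") + seq.count("T")
    gc_percent = 100.0 * gc / acgt if acgt else 0.0
    return lengths, gc_percent


# Demo on a tiny made-up file so the sketch runs without the real download.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.fasta.gz")
    with gzip.open(demo_path, "wt") as out:
        out.write(">seq1 toy record\nACGTNN\nGGCC\n>seq2\nATAT\n")
    demo_lengths, demo_gc = fasta_lengths_and_gc(demo_path)

print(demo_lengths)
print(round(demo_gc, 2))
```

For the real data, the call would be `fasta_lengths_and_gc("GWHCBHO00000000.1.genome.fasta.gz")`, assuming the file sits in the working directory.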

Gene Predictions

Gene prediction files for the Triticum monococcum PI306540 genome are available in GFF3 and FASTA formats.

Downloads

Genes (GFF3 file): GWHCBHO00000000.1.gff.gz
CDS sequences (FASTA file): GWHCBHO00000000.1.RNA.fasta.gz
Protein sequences (FASTA file): GWHCBHO00000000.1.Protein.faa.gz
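
A quick first look at a GFF3 annotation is to tally features by type (column 3 of the nine tab-separated GFF3 columns). The following is a minimal sketch; the function name and the toy records are hypothetical, and the feature types actually present in the PI306540 file are not stated in this record.

```python
import gzip
import os
import tempfile
from collections import Counter


def count_feature_types(path):
    """Count GFF3 features by type (third column); handles .gz or plain text."""
    counts = Counter()
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith("##FASTA"):
                break  # optional embedded sequence section follows
            if line.startswith("#") or not line.strip():
                continue  # directives, comments, blank lines
            cols = line.rstrip("\n").split("\t")
            if len(cols) != 9:
                continue  # skip malformed rows rather than guessing
            counts[cols[2]] += 1
    return counts


# Toy records (made up) so the sketch runs without the real download.
toy_gff = (
    "##gff-version 3\n"
    "chrToy\tdemo\tgene\t1\t100\t.\t+\t.\tID=g1\n"
    "chrToy\tdemo\tmRNA\t1\t100\t.\t+\t.\tID=t1;Parent=g1\n"
    "chrToy\tdemo\tCDS\t1\t50\t.\t+\t0\tID=c1;Parent=t1\n"
    "chrToy\tdemo\tCDS\t60\t100\t.\t+\t1\tID=c1;Parent=t1\n"
)
with tempfile.TemporaryDirectory() as tmp:
    toy_path = os.path.join(tmp, "toy.gff.gz")
    with gzip.open(toy_path, "wt") as out:
        out.write(toy_gff)
    toy_counts = count_feature_types(toy_path)

print(dict(toy_counts))
```

For the real annotation, the call would be `count_feature_types("GWHCBHO00000000.1.gff.gz")`, assuming the file sits in the working directory.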

Functional Analysis

Functional annotation for the Triticum monococcum PI306540 genome is available for download below. The predicted proteins were analyzed with InterProScan to assign InterPro domains (Pfam).

Downloads

Domain from InterProScan: Triticum_monococcum.Pfam.tsv.gz
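
The domain table can be grouped by protein with a few lines of Python. The sketch below assumes the file follows the standard InterProScan TSV column order (protein accession, MD5, length, analysis, signature accession, signature description, start, stop, ...); this record does not confirm the layout, so check a few lines of the file first. The function name and the toy rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def pfam_domains_by_protein(handle):
    """Group Pfam hits by protein from an InterProScan-style TSV stream.

    Assumed columns (standard InterProScan TSV, not confirmed for this file):
    0 protein, 3 analysis, 4 signature accession, 5 description, 6 start, 7 stop.
    """
    domains = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 8 or row[3] != "Pfam":
            continue  # skip short rows and non-Pfam analyses
        domains[row[0]].append((row[4], row[5], int(row[6]), int(row[7])))
    return dict(domains)


# Toy rows with made-up identifiers, for illustration only.
toy_tsv = io.StringIO(
    "protToy1\tmd5placeholder\t300\tPfam\tPF00000\tToy domain\t10\t120\t1e-20\tT\t01-01-2023\n"
    "protToy1\tmd5placeholder\t300\tOther\tX0001\tNot Pfam\t5\t50\t1e-5\tT\t01-01-2023\n"
)
toy_domains = pfam_domains_by_protein(toy_tsv)
print(toy_domains)
```

For the downloaded file, the handle could come from `gzip.open("Triticum_monococcum.Pfam.tsv.gz", "rt", newline="")`.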

S genes

Summary

Query | Chromosome | Size (bp) | Coordinates | tBLASTn Hit | tBLASTn %ID | Domain
DUF247I-SΨ | GWHCBHO00000001.1 | 644540180 | 101722013-101723023 | LpSDUF247-I_chromosome1 | 79 | DUF247
DUF247II-SΨ | GWHCBHO00000001.1 | 644540180 | 101116510-101116785 | LpSDUF247-II_chromosome1 | 76 | DUF247
HPS10-S | GWHCBHO00000001.1 | 644540180 | 101712392-101712518, 101712666-101712769 | LpsS_contig12948 | 66 | -
HPS10-Z | GWHCBHO00000002.1 | 829514116 | 763963982-763964093, 763964214-763964344 | Hmarinum | 83 | -
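
The coordinate entries in the S-gene summary are either a single span or a comma-separated pair of spans. The following is a minimal sketch for turning such strings into intervals and span lengths; the helper names are hypothetical, and the coordinates are assumed to be 1-based and inclusive, which this record does not state.

```python
def parse_spans(text):
    """Parse 'start-end' or 'start-end, start-end' into a list of int pairs."""
    spans = []
    for part in text.replace("\n", " ").split(","):
        part = part.strip()
        if not part:
            continue
        start, end = (int(value) for value in part.split("-"))
        spans.append((start, end))
    return spans


def total_length(spans):
    """Sum span lengths, assuming 1-based inclusive coordinates."""
    return sum(end - start + 1 for start, end in spans)


# Coordinate strings copied from the summary table.
coordinates = {
    "DUF247I-S": "101722013-101723023",
    "DUF247II-S": "101116510-101116785",
    "HPS10-S": "101712392-101712518, 101712666-101712769",
    "HPS10-Z": "763963982-763964093, 763964214-763964344",
}
span_lengths = {name: total_length(parse_spans(text)) for name, text in coordinates.items()}
for name, length in span_lengths.items():
    print(name, length)
```

Under the inclusive-coordinate assumption, the two HPS10 entries each resolve to two short spans, and the DUF247 entries to a single span each.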


© 2023 National Genomics Data Center, China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences