Hordeum spontaneum EC_S1 Assembly & Annotation

Overview

Analysis Name Hordeum spontaneum EC_S1 Assembly & Annotation
Sequencing technology Nanopore, Illumina, Hi-C
Assembly method NextDenovo
Release Date 2023-08-10
Reference Publication(s)

Pan R, Hu H, Xiao Y, Xu L, Xu Y, Ouyang K, Li C, He T, Zhang W. High-quality wild barley genome assemblies and annotation with Nanopore long reads and Hi-C sequencing data. Sci Data. 2023 Aug 10;10(1):535. doi: 10.1038/s41597-023-02434-2.

Abstract

Wild barley, from "Evolution Canyon (EC)" in Mount Carmel, Israel, are ideal models for cereal chromosome evolution studies. Here, the wild barley EC_S1 is from the south slope with higher daily temperatures and drought, while EC_N1 is from the north slope with a cooler climate and higher relative humidity, which results in a differentiated selection due to contrasting environments. We assembled a 5.03 Gb genome with contig N50 of 3.53 Mb for wild barley EC_S1 and a 5.05 Gb genome with contig N50 of 3.45 Mb for EC_N1 using 145 Gb and 160.0 Gb Illumina sequencing data, 295.6 Gb and 285.35 Gb Nanopore sequencing data and 555.1 Gb and 514.5 Gb Hi-C sequencing data, respectively. BUSCOs and CEGMA evaluation suggested highly complete assemblies. Using full-length transcriptome data, we predicted 39,179 and 38,373 high-confidence genes in EC_S1 and EC_N1, in which 93.6% and 95.2% were functionally annotated, respectively. We annotated repetitive elements and non-coding RNAs. These two wild barley genome assemblies will provide a rich gene pool for domesticated barley.

Assembly statistics

Genome size 5,025,137,494 bp
Scaffold N50 90,435,441 bp
Contig N50 3,525,661 bp
Assembly level Chromosome
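
For readers unfamiliar with the statistic, N50 is the length of the sequence at which the running total, taken from the longest sequence downwards, first reaches half of the total assembly length. The sketch below shows that calculation on a toy list of lengths. The function name `n50` is illustrative and is not taken from the assembly pipeline, which may handle ties or gaps differently.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    ordered = sorted(lengths, reverse=True)   # longest sequence first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length that brings the running total to at least
        # half of the assembly size is the N50.
        if running >= half_total:
            return length
    raise ValueError("no sequence lengths given")


# Toy example: total = 20, half = 10; 6 + 5 = 11 reaches it at length 5.
print(n50([2, 3, 4, 5, 6]))  # prints 5
```

Applied to contig lengths this gives a contig N50, and applied to scaffold lengths a scaffold N50.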

Assembly

The Hordeum spontaneum EC_S1 Assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file) GCA_029783385.1_ASM2978338v1_genomic.fna.gz

Gene Predictions

The Hordeum spontaneum EC_S1 genome gene prediction files are available in GFF3 and FASTA format.

Downloads

Genes (GFF3 file) EC_N1.gff.gz
CDS sequences (FASTA file) EC_N1.cds.fasta.gz
Protein sequences (FASTA file) EC_N1.pep.gz
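
As a rough sketch of how the GFF3 download might be inspected, the snippet below tallies feature types from the third column of a (possibly gzipped) GFF3 file. The function name `count_feature_types` is hypothetical, and the snippet assumes the file follows the standard nine-column, tab-delimited GFF3 layout. Which feature types the file actually contains has not been checked here.

```python
import gzip
from collections import Counter


def count_feature_types(gff3_path):
    """Count GFF3 records per feature type (column 3)."""
    counts = Counter()
    # Open transparently whether or not the file is gzip-compressed.
    opener = gzip.open if str(gff3_path).endswith(".gz") else open
    with opener(gff3_path, "rt") as handle:
        for line in handle:
            # Skip directives, comments, and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            # Only well-formed nine-column records are counted.
            if len(fields) == 9:
                counts[fields[2]] += 1
    return counts
```

If the file uses the conventional `gene` feature type, `count_feature_types("EC_N1.gff.gz")["gene"]` would give a figure that can be compared with the gene numbers reported in the abstract.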

Functional Analysis

Functional annotation for the Hordeum spontaneum EC_S1 genome is available for download below. The proteins were analyzed with InterProScan to assign InterPro domains (Pfam).

Downloads

Domain from InterProScan Hordeum_spontaneum.Pfam.tsv.gz

S genes

Summary

Query | Chromosome | Size (bp) | Coordinates | tBLASTn Hit | tBLASTn %ID | Domain
DUF247II-S | CM056660.1 | 560836502 | 84744224-84745902 | LpSDUF247-II_chromosome1 | 75 | DUF247
HPS10-Z | CM056661.1 | 725682262 | 678335003-678335144, 678335271-678335359 | Amyosuroides | 36 | -
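
The coordinate strings in the summary can be turned into interval lengths or used to pull the matching bases out of a chromosome sequence. The sketch below assumes the coordinates are 1-based, inclusive, and on the forward strand, which the page does not state. The helper names `parse_intervals`, `interval_lengths`, and `extract` are illustrative only.

```python
def parse_intervals(coordinates):
    """Parse 'start-end, start-end' into a list of (start, end) integers."""
    intervals = []
    for part in coordinates.split(","):
        part = part.strip()
        if not part:
            continue
        start, end = (int(value) for value in part.split("-"))
        intervals.append((start, end))
    return intervals


def interval_lengths(coordinates):
    """Length in bp of each interval, assuming inclusive coordinates."""
    return [end - start + 1 for start, end in parse_intervals(coordinates)]


def extract(sequence, coordinates):
    """Join the bases covered by each interval (1-based, inclusive)."""
    return "".join(sequence[start - 1:end]
                   for start, end in parse_intervals(coordinates))


print(interval_lengths("84744224-84745902"))                        # [1679]
print(interval_lengths("678335003-678335144, 678335271-678335359"))  # [142, 89]
print(extract("ACGTACGTAC", "2-4, 7-8"))                            # CGTGT
```

For a hit on the reverse strand, the extracted sequence would additionally need to be reverse-complemented.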

Nucleotide

Protein
