Echinochloa colona eco_v1 Assembly & Annotation

Overview

Analysis Name Echinochloa colona eco_v1 Assembly & Annotation
Sequencing technology PacBio
Assembly method hifiasm v0.12-r304
Release Date 2022-01-10
Reference Publication(s)

Wu D, Shen E, Jiang B, Feng Y, Tang W, Lao S, Jia L, Lin HY, Xie L, Weng X, Dong C, Qian Q, Lin F, Xu H, Lu H, Cutti L, Chen H, Deng S, Guo L, Chuah TS, Song BK, Scarabel L, Qiu J, Zhu QH, Yu Q, Timko MP, Yamaguchi H, Merotto A Jr, Qiu Y, Olsen KM, Fan L, Ye CY. Genomic insights into the evolution of Echinochloa species as weed and orphan crop. Nat Commun. 2022 Feb 3;13(1):689. doi: 10.1038/s41467-022-28359-9.

Abstract

As one of the great survivors of the plant kingdom, barnyard grasses (Echinochloa spp.) are the most noxious and common weeds in paddy ecosystems. Meanwhile, at least two Echinochloa species have been domesticated and cultivated as millets. In order to better understand the genomic forces driving the evolution of Echinochloa species toward weed and crop characteristics, we assemble genomes of three Echinochloa species (allohexaploid E. crus-galli and E. colona, and allotetraploid E. oryzicola) and re-sequence 737 accessions of barnyard grasses and millets from 16 rice-producing countries. Phylogenomic and comparative genomic analyses reveal the complex and reticulate evolution in the speciation of Echinochloa polyploids and provide evidence of constrained disease-related gene copy numbers in Echinochloa. A population-level investigation uncovers deep population differentiation for local adaptation, multiple target-site herbicide resistance mutations of barnyard grasses, and limited domestication of barnyard millets. Our results provide genomic insights into the dual roles of Echinochloa species as weeds and crops as well as essential resources for studying plant polyploidization, adaptation, precision weed control and millet improvements.

Assembly statistics

Genome size (bp) 1,127,172,362
GC content 45.92%
Genome sequence No. 908
Maximum genome sequence length (bp) 64,906,485
Minimum genome sequence length (bp) 14,717
Average genome sequence length (bp) 1,241,379
Genome sequence N50 (bp) 42,067,172
Genome sequence N90 (bp) 28,545,782
Assembly level Chromosome
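
For readers who want to recompute these figures from a list of sequence lengths, the following is a minimal Python sketch of the usual N50/N90 definition (the length at which the cumulative sum of sequences, sorted longest first, reaches 50% or 90% of the total). The function name `nx` and the toy lengths are illustrative assumptions; how the page's own statistics were produced is not stated.

```python
def nx(lengths, fraction):
    """Return the Nx length: sequences at least this long cover
    at least `fraction` of the summed length of all sequences."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    threshold = sum(lengths) * fraction
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length

# Toy lengths for illustration only, not the real eco_v1 sequences.
toy_lengths = [50, 40, 30, 20, 10]
toy_n50 = nx(toy_lengths, 0.5)
toy_n90 = nx(toy_lengths, 0.9)

# Consistency check on the reported figures:
# genome size divided by sequence count gives the reported average length.
reported_average = 1_127_172_362 // 908
```

On the toy input the N50 is 40 and the N90 is 20; the integer division reproduces the reported average of 1,241,379 bp.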

Assembly

The Echinochloa colona eco_v1 assembly is available in FASTA format.

Downloads

Chromosomes (FASTA file) GWHBDNQ00000000.genome.fasta.gz
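
As a rough illustration of how the gzipped FASTA file could be inspected once downloaded, the sketch below reads a `.fasta.gz` file with the Python standard library and reports per-sequence lengths and overall GC content. The function name `fasta_stats` is hypothetical, and the choice to compute GC over A/C/G/T bases only (ignoring N) is an assumption; the page does not say how its GC figure was calculated. The demo writes a tiny made-up file rather than touching the real download.

```python
import gzip
import os
import tempfile

def fasta_stats(path):
    """Return ({sequence_id: length}, GC fraction over A/C/G/T bases)."""
    lengths = {}
    gc = 0
    acgt = 0
    name = None
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Keep only the identifier before the first whitespace.
                name = line[1:].split()[0]
                lengths[name] = 0
            elif name is not None:
                seq = line.upper()
                lengths[name] += len(seq)
                gc += seq.count("G") + seq.count("C")
                acgt += sum(seq.count(base) for base in "ACGT")
    gc_fraction = gc / acgt if acgt else 0.0
    return lengths, gc_fraction

# Demo on a tiny, made-up FASTA file (not the real assembly).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "toy.fasta.gz")
    with gzip.open(demo_path, "wt") as out:
        out.write(">seq1 toy record\nACGTNN\nGG\n>seq2\nAT\n")
    toy_fasta_lengths, toy_gc = fasta_stats(demo_path)
```

The resulting length dictionary can be passed to an N50 routine or compared against the sequence count listed under Assembly statistics.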

Gene Predictions

The Echinochloa colona eco_v1 genome gene prediction files are available in GFF3 and FASTA format.

Downloads

Genes (GFF3 file) GWHBDNQ00000000.gff.gz
CDS sequences (FASTA file) GWHBDNQ00000000.CDS.fasta.gz
Protein sequences (FASTA file) GWHBDNQ00000000.Protein.faa.gz
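
A quick way to sanity-check a GFF3 download is to count features by type. The sketch below assumes only the standard nine-column, tab-separated GFF3 layout, where column 3 holds the feature type; the function name `count_feature_types` and the toy records are illustrative, and the feature types actually present in the eco_v1 file are not listed on this page.

```python
from collections import Counter

def count_feature_types(lines):
    """Count GFF3 features by type (column 3), skipping comments and directives."""
    counts = Counter()
    for line in lines:
        if line.startswith("##FASTA"):
            break  # an embedded sequence section follows; no more features
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            continue  # not a well-formed feature line
        counts[fields[2]] += 1
    return counts

# Toy records for illustration; identifiers are made up.
toy_gff3 = [
    "##gff-version 3\n",
    "chrToy\tdemo\tgene\t1\t900\t.\t+\t.\tID=gene1\n",
    "chrToy\tdemo\tmRNA\t1\t900\t.\t+\t.\tID=mrna1;Parent=gene1\n",
    "chrToy\tdemo\tCDS\t1\t300\t.\t+\t0\tID=cds1;Parent=mrna1\n",
    "chrToy\tdemo\tCDS\t601\t900\t.\t+\t0\tID=cds1;Parent=mrna1\n",
]
toy_counts = count_feature_types(toy_gff3)
```

Because the function accepts any iterable of lines, it could be fed a handle from `gzip.open(path, "rt")` to read the compressed file directly.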

Functional Analysis

Functional annotation for the Echinochloa colona eco_v1 assembly is available for download below. The proteins were analyzed with InterProScan to assign InterPro domains (Pfam).

Downloads

Domain from InterProScan Echinochloa_colona.Pfam.tsv.gz
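
The sketch below shows one way to group domain hits by protein, assuming the file follows the standard InterProScan TSV column order (protein accession in column 1, signature accession in column 5, start and stop in columns 7 and 8). That layout is an assumption; this page does not document the columns of the downloadable file. The function name `domains_by_protein` and the toy rows, including their accessions, are made up.

```python
import csv
from collections import defaultdict

def domains_by_protein(lines):
    """Map protein ID -> list of (signature accession, start, stop)."""
    hits = defaultdict(list)
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 8:
            continue  # too few columns for the assumed layout
        if not (row[6].isdigit() and row[7].isdigit()):
            continue  # skip a header row or malformed coordinates
        hits[row[0]].append((row[4], int(row[6]), int(row[7])))
    return dict(hits)

# Toy rows in the assumed layout; all identifiers are placeholders.
toy_tsv = [
    "protA\tmd5a\t500\tPfam\tPF_TOY_1\tToy domain one\t10\t120\t1e-5\tT\t01-01-2022\n",
    "protA\tmd5a\t500\tPfam\tPF_TOY_2\tToy domain two\t200\t310\t2e-8\tT\t01-01-2022\n",
    "protB\tmd5b\t300\tPfam\tPF_TOY_1\tToy domain one\t5\t115\t3e-4\tT\t01-01-2022\n",
]
toy_hits = domains_by_protein(toy_tsv)
```

If the real file uses a different column order, only the indices inside the loop would need to change.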

S genes

Summary

Query         Chromosome       Size (bp)  Coordinates                           tBLASTn Hit    tBLASTn %ID  Domain
DUF247I-S     GWHBDNQ00000017  42512179   18441003-18442667                     Shybrid        62           DUF247
DUF247II-SΨ   GWHBDNQ00000017  42512179   18990254-18990520                     Shybrid        58           DUF247
HPS10-S       GWHBDNQ00000017  42512179   18985832-18985970, 18986120-18986223  ShybridS1      63           -
DUF247I-Z1    GWHBDNQ00000010  28853634   26630623-26632200                     Shybrid        57           DUF247
DUF247I-Z2    GWHBDNQ00000020  39951026   37154008-37155516                     Shybrid        60           DUF247
DUF247I-Z3Ψ   GWHBDNQ00000027  32271771   29984378-29984857                     Shybrid        57           DUF247
DUF247II-Z1   GWHBDNQ00000020  39951026   37161388-37163046                     Shybrid        55           DUF247
DUF247II-Z2   GWHBDNQ00000027  32271771   29992371-29994059                     Shybrid        57           DUF247
HPS10-Z1      GWHBDNQ00000010  28853634   26629108-26629217, 26629320-26629476  SspontaneumZ2  36           -
HPS10-Z2      GWHBDNQ00000020  39951026   37159381-37159484, 37159596-37159722  Orufipogon     35           -
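
The coordinates in the summary table above can be turned into segment lengths and extracted sequence with a few lines of Python. This sketch assumes the coordinates are 1-based and inclusive, which is a common convention but is not stated on this page; strand is also not given, so no reverse-complementing is attempted. The helper names and the toy sequence are illustrative.

```python
def parse_coordinates(text):
    """Parse 'start-end' or 'start-end, start-end' into a list of int pairs."""
    segments = []
    for part in text.split(","):
        start, end = part.strip().split("-")
        segments.append((int(start), int(end)))
    return segments

def total_length(segments):
    """Summed length of segments, assuming 1-based inclusive coordinates."""
    return sum(end - start + 1 for start, end in segments)

def extract(sequence, segments):
    """Concatenate the segments from a sequence string (1-based inclusive)."""
    return "".join(sequence[start - 1:end] for start, end in segments)

# Coordinates copied from the table rows for DUF247I-S and HPS10-S.
duf247_s = parse_coordinates("18441003-18442667")
hps10_s = parse_coordinates("18985832-18985970, 18986120-18986223")
duf247_s_length = total_length(duf247_s)
hps10_s_length = total_length(hps10_s)

# Toy sequence to show the slicing; not real genome sequence.
toy_extract = extract("ACGTACGTAC", [(2, 4), (7, 8)])
```

Under the stated assumption, the DUF247I-S locus spans 1,665 bp and the two HPS10-S segments together span 243 bp.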


© 2023 National Genomics Data Center, China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences