Oryza eichingeri v1 Assembly & Annotation

Overview

Analysis Name Oryza eichingeri v1 Assembly & Annotation
Sequencing technology Illumina
Assembly method Platanus (version 1.2.1)
Release Date 2020-03-03
Reference Publication(s)

Shenton M, Kobayashi M, Terashima S, Ohyanagi H, Copetti D, Hernández-Hernández T, Zhang J, Ohmido N, Fujita M, Toyoda A, Ikawa H, Fujiyama A, Furuumi H, Miyabayashi T, Kubo T, Kudrna D, Wing R, Yano K, Nonomura KI, Sato Y, Kurata N. Evolution and Diversity of the Wild Rice Oryza officinalis Complex, across Continents, Genome Types, and Ploidy Levels. Genome Biol Evol. 2020 Apr 1;12(4):413-428. doi: 10.1093/gbe/evaa037.

Abstract

The Oryza officinalis complex is the largest species group in Oryza, with more than nine species from four continents, and is a tertiary gene pool that can be exploited in breeding programs for the improvement of cultivated rice. Most diploid and tetraploid members of this group have a C genome. Using a new reference C genome for the diploid species O. officinalis, and draft genomes for two other C genome diploid species Oryza eichingeri and Oryza rhizomatis, we examine the influence of transposable elements on genome structure and provide a detailed phylogeny and evolutionary history of the Oryza C genomes. The O. officinalis genome is 1.6 times larger than the A genome of cultivated Oryza sativa, mostly due to proliferation of Gypsy type long-terminal repeat transposable elements, but overall syntenic relationships are maintained with other Oryza genomes (A, B, and F). Draft genome assemblies of the two other C genome diploid species, Oryza eichingeri and Oryza rhizomatis, and short-read resequencing of a series of other C genome species and accessions reveal that after the divergence of the C genome progenitor, there was still a substantial degree of variation within the C genome species through proliferation and loss of both DNA and long-terminal repeat transposable elements. We provide a detailed phylogeny and evolutionary history of the Oryza C genomes and a genomic resource for the exploitation of the Oryza tertiary gene pool.

Assembly statistics

Assembly Size (Mb) 471
Repeats (%) 50.10
Annotated Loci 31,030
Scaffold N50 (kb) 64
Assembly level Scaffold
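
The assembly size and scaffold N50 reported above can be recomputed from the scaffold FASTA file. The following is a minimal sketch, not the pipeline used by the authors; the function names `read_fasta_lengths` and `n50` are illustrative, and the N50 definition assumed here is the common one (the length L such that scaffolds of length L or longer cover at least half of the total assembly).

```python
import gzip


def read_fasta_lengths(path):
    """Return the length of every sequence in a FASTA file (gzipped or plain)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    lengths = []
    current = None  # length of the record being read; None before the first header
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if current is not None:
                    lengths.append(current)
                current = 0
            elif line and current is not None:
                current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Assumed definition: smallest length L among the longest sequences
    whose cumulative length reaches at least half of the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0
```

Calling `read_fasta_lengths("Oeich_scaffolds.fasta.gz")` and passing the result to `sum` and `n50` should give values comparable to the table, although small differences are possible if the published figures were computed with other conventions (for example, excluding short scaffolds).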

Assembly

The Oryza eichingeri v1 Assembly file is available in FASTA format.

Downloads

Scaffolds (FASTA file) Oeich_scaffolds.fasta.gz

Gene Predictions

The Oryza eichingeri v1 gene prediction files are available in GFF3 and FASTA formats.

Downloads

Genes (GFF3 file) Oeich_maker_gene_annotation.gff.gz
CDS sequences (FASTA file) Oe_cds.fa.gz
Protein sequences (FASTA file) Oe_pep.fa.gz
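
As a quick sanity check on the GFF3 download, the feature types it contains can be tallied. This is a minimal sketch that assumes a standard nine-column GFF3 file; the function name `count_gff3_features` is illustrative, and whether the annotated loci correspond to `gene` features in this particular file is an assumption that should be verified against the file itself.

```python
import gzip
from collections import Counter


def count_gff3_features(path):
    """Count GFF3 records by feature type (column 3)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    counts = Counter()
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith("##FASTA"):
                break  # an embedded sequence section ends the annotation records
            if line.startswith("#") or not line.strip():
                continue  # skip directives, comments, and blank lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) != 9:
                continue  # skip malformed records
            counts[fields[2]] += 1
    return counts
```

For example, `count_gff3_features("Oeich_maker_gene_annotation.gff.gz")["gene"]` could be compared with the number of annotated loci listed under Assembly statistics.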

Functional Analysis

Functional annotation for Oryza eichingeri v1 is available for download below. The predicted proteins were analyzed with InterProScan to assign InterPro domains (Pfam).

Downloads

Domain from InterProScan Oryza_eichingeri.Pfam.tsv.gz
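
The domain assignments can be loaded into a protein-to-domain lookup. The sketch below assumes the download follows the standard InterProScan TSV layout (column 1 protein accession, column 4 analysis name, column 5 signature accession); this has not been checked against the file, and the function name `pfam_domains_by_protein` is illustrative.

```python
import gzip
from collections import defaultdict


def pfam_domains_by_protein(path):
    """Map each protein accession to the set of Pfam accessions assigned to it."""
    opener = gzip.open if str(path).endswith(".gz") else open
    domains = defaultdict(set)
    with opener(path, "rt") as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            # Assumed layout: fields[0] = protein, fields[3] = analysis, fields[4] = signature
            if len(fields) < 6 or fields[3] != "Pfam":
                continue
            domains[fields[0]].add(fields[4])
    return dict(domains)
```

If the file uses a different column order, only the three field indices need to change.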

S genes

Summary

Query        Scaffold     Size (bp)  Coordinates                tBLASTn Hit      tBLASTn %ID  Domain
DUF247I-SΨ   scaffold471  128206     95774-96085                Olongistaminata  77           DUF247
DUF247II-SΨ  scaffold471  128206     81243-82409                Olongistaminata  63           DUF247
HPS10-S      scaffold471  128206     84440-84542, 84668-84798
LmsS_scaffold81837-
DUF247I-ZΨ   scaffold8487158629817-10779                        Dglomerata       61           DUF247
DUF247II-ZΨ  scaffold730  109338     12512-12733                TturgidumZ2      63           DUF247
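
The values in the Coordinates column can be used to pull the corresponding region out of a scaffold sequence. The sketch below assumes the coordinates are 1-based, inclusive, and given on the forward strand, which the page does not state; the function names `parse_coordinates` and `extract_region` are illustrative.

```python
def parse_coordinates(text):
    """Parse 'start-end' ranges separated by commas into (start, end) tuples."""
    intervals = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        start, end = part.split("-")
        intervals.append((int(start), int(end)))
    return intervals


def extract_region(sequence, intervals):
    """Concatenate the given intervals from a sequence.

    Assumes 1-based inclusive coordinates on the forward strand.
    """
    return "".join(sequence[start - 1:end] for start, end in intervals)
```

For a multi-segment entry such as HPS10-S, `parse_coordinates("84440-84542, 84668-84798")` yields two intervals, and `extract_region` joins them in the order listed.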


© 2023 National Genomics Data Center, China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences