Oryza officinalis v1.0 Assembly & Annotation

Overview

Analysis Name Oryza officinalis v1.0 Assembly & Annotation
Sequencing technology Illumina HiSeq 2500; PacBio RS II
Assembly method Platanus v. 1.2.1; PBJelly v. 14.1.14; SSPACE v. 3.0; GapCloser v. 1.12
Release Date 2019-08-23
Reference Publication(s)

Shenton M, Kobayashi M, Terashima S, Ohyanagi H, Copetti D, Hernández-Hernández T, Zhang J, Ohmido N, Fujita M, Toyoda A, Ikawa H, Fujiyama A, Furuumi H, Miyabayashi T, Kubo T, Kudrna D, Wing R, Yano K, Nonomura KI, Sato Y, Kurata N. Evolution and Diversity of the Wild Rice Oryza officinalis Complex, across Continents, Genome Types, and Ploidy Levels. Genome Biol Evol. 2020 Apr 1;12(4):413-428. doi: 10.1093/gbe/evaa037.

Abstract

The Oryza officinalis complex is the largest species group in Oryza, with more than nine species from four continents, and is a tertiary gene pool that can be exploited in breeding programs for the improvement of cultivated rice. Most diploid and tetraploid members of this group have a C genome. Using a new reference C genome for the diploid species O. officinalis, and draft genomes for two other C genome diploid species Oryza eichingeri and Oryza rhizomatis, we examine the influence of transposable elements on genome structure and provide a detailed phylogeny and evolutionary history of the Oryza C genomes. The O. officinalis genome is 1.6 times larger than the A genome of cultivated Oryza sativa, mostly due to proliferation of Gypsy type long-terminal repeat transposable elements, but overall syntenic relationships are maintained with other Oryza genomes (A, B, and F). Draft genome assemblies of the two other C genome diploid species, Oryza eichingeri and Oryza rhizomatis, and short-read resequencing of a series of other C genome species and accessions reveal that after the divergence of the C genome progenitor, there was still a substantial degree of variation within the C genome species through proliferation and loss of both DNA and long-terminal repeat transposable elements. We provide a detailed phylogeny and evolutionary history of the Oryza C genomes and a genomic resource for the exploitation of the Oryza tertiary gene pool.

Assembly statistics

Genome size 584.1 Mb
Total ungapped length 583.2 Mb
Number of chromosomes 12
Number of scaffolds 91
Scaffold N50 49.5 Mb
Scaffold L50 5
Number of contigs 9,873
Contig N50 367.6 kb
Contig L50 469
GC percent 44
Genome coverage 60.0x
Assembly level Scaffold
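
For readers unfamiliar with the N50 and L50 figures above, the following is a minimal sketch of how such values are conventionally computed from a list of sequence lengths. The function name `n50_l50` and the toy lengths are illustrative assumptions; this is not the pipeline used to produce the statistics on this page.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total
    length; L50 is the number of sequences in that set.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("lengths must be non-empty")


# Toy lengths for illustration only, not taken from the assembly.
toy_lengths = [50, 40, 30, 20, 10]
print(n50_l50(toy_lengths))
```

Applied to the scaffold lengths of an assembly, the same function would give the scaffold N50 and L50; applied to contig lengths, the contig values.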

Assembly

The Oryza officinalis v1.0 Assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file) GCA_008326285.1_Oryza_officinalis_v1.0_genomic.fna.gz
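
As a hedged illustration of how the gzipped FASTA download might be inspected, the sketch below reads a `.fna.gz` file with the Python standard library and reports the length of each sequence. The helper name `fasta_lengths` is an assumption introduced here, and the file is assumed to have been downloaded to the working directory.

```python
import gzip


def fasta_lengths(path):
    """Map each sequence ID (first word of the header) to its length."""
    lengths = {}
    current = None
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                header = line[1:].split()
                current = header[0] if header else ""
                lengths[current] = 0
            elif current is not None:
                lengths[current] += len(line)
    return lengths


# Example usage, assuming the file has been downloaded locally:
# lengths = fasta_lengths("GCA_008326285.1_Oryza_officinalis_v1.0_genomic.fna.gz")
# print(len(lengths), "sequences,", sum(lengths.values()), "bp in total")
```

The resulting dictionary of lengths can be compared against the sequence count and total length reported under Assembly statistics as a sanity check on the download.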

Gene Predictions

The Oryza officinalis v1.0 genome gene prediction files are available in GFF3 and FASTA format.

Downloads

Genes (GFF3 file) Ooffi_maker_gene_annotation.gff.gz
CDS sequences (FASTA file) Of_cds.fa.gz
Protein sequences (FASTA file) Of_pep.fa.gz
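
A quick way to get an overview of the gene annotation is to tally feature types in the GFF3 file. The sketch below assumes the file follows the standard nine-column, tab-separated GFF3 layout, where column 3 holds the feature type; the function name `count_feature_types` is hypothetical, and the feature types actually present in this file have not been checked here.

```python
import gzip
from collections import Counter


def count_feature_types(path):
    """Count GFF3 features by type (column 3) in a gzipped file."""
    counts = Counter()
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("##FASTA"):
                break  # an embedded sequence section ends the feature table
            if line.startswith("#") or not line.strip():
                continue  # skip directives, comments, and blank lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) == 9:
                counts[fields[2]] += 1
    return counts


# Example usage, assuming the file has been downloaded locally:
# for feature_type, n in count_feature_types("Ooffi_maker_gene_annotation.gff.gz").most_common():
#     print(feature_type, n)
```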

Functional Analysis

Functional annotation for the Oryza officinalis v1.0 assembly is available for download below. The proteins were analyzed using InterProScan to assign InterPro domains (Pfam).

Downloads

Domain from InterProScan Oryza_officinalis.Pfam.tsv.gz

S genes

Summary

Query | Scaffold | Size (bp) | Coordinates | tBLASTn Hit | tBLASTn %ID | Domain
DUF247I-SΨ | BDMV01000005.1 | 45417702 | 7927411-7927560 | Olongistaminata | 74 | DUF247
DUF247II-SΨ | BDMV01000005.1 | 45417702 | 7912446-7912805 | Osativa | 80 | DUF247
HPS10-S | BDMV01000005.1 | 45417702 | 7916092-7916194, 7916296-7916441 | Osativa | 79 | -
DUF247II-ZΨ | BDMV01000004.1 | 42162321 | 39534564-39534785 | TturgidumZ2 | 60 | DUF247
HPS10-Z | BDMV01000004.1 | 42162321 | 39528899-39529037, 39529266-39529336 | LpsZ_chromosome2 | 68 | -
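
The coordinates in the summary can be used to pull the corresponding regions out of a scaffold sequence. The sketch below is an illustration only: it assumes the coordinates are 1-based and inclusive on the forward strand, which this page does not state, and the helper names `parse_intervals` and `extract` are hypothetical. If a hit lies on the reverse strand, the result would still need to be reverse-complemented.

```python
def parse_intervals(text):
    """Parse a string such as '100-200, 300-350' into (start, end) pairs."""
    intervals = []
    for part in text.split(","):
        start, end = part.strip().split("-")
        intervals.append((int(start), int(end)))
    return intervals


def extract(sequence, intervals):
    """Concatenate the given intervals from a sequence.

    Coordinates are assumed to be 1-based and inclusive (an assumption,
    not confirmed by the source table).
    """
    return "".join(sequence[start - 1:end] for start, end in intervals)


# Coordinates of HPS10-S as listed in the summary.
hps10_s = parse_intervals("7916092-7916194, 7916296-7916441")

# Toy demonstration on a short made-up sequence.
print(extract("ACGTACGTAC", [(2, 4), (7, 8)]))
```

With a scaffold sequence loaded as a string (for example BDMV01000005.1 from the assembly FASTA file), `extract(scaffold, hps10_s)` would return the two listed HPS10-S segments joined together.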
