Oryza granulata OGRA Assembly & Annotation

Overview

Analysis Name Oryza granulata OGRA Assembly & Annotation
Sequencing technology Illumina HiSeq 2000
Assembly method AllPaths version 52488
Release Date 2019-03-19
Reference Publication(s)

Shi C, Li W, Zhang QJ, Zhang Y, Tong Y, Li K, Liu YL, Gao LZ. The draft genome sequence of an upland wild rice species, Oryza granulata. Sci Data. 2020 Apr 29;7(1):131. doi: 10.1038/s41597-020-0470-2.

Abstract

Exploiting novel gene sources from wild relatives has proven to be an efficient approach to advance crop genetic breeding efforts. Oryza granulata, with the GG genome type, occupies the basal position of the Oryza phylogeny and has the second largest genome (~882 Mb). As an upland wild rice species, it possesses renowned traits that distinguish it from other Oryza species, such as tolerance to shade and drought, immunity to bacterial blight and resistance to the brown planthopper. Here, we generated a 736.66-Mb genome assembly of O. granulata with 40,131 predicted protein-coding genes. With Hi-C data, for the first time, we anchored ~98.2% of the genome assembly to the twelve pseudo-chromosomes. This chromosome-length genome assembly of O. granulata will provide novel insights into rice genome evolution, enhance our efforts to search for new genes for future rice breeding programmes and facilitate the conservation of germplasm of this endangered wild rice species.

Assembly statistics

Genome size (bp) 736,660,308
GC content 45.87%
Scaffold sequence No. 2,393
Maximum scaffold sequence length (bp) 4,040,447
Minimum scaffold sequence length (bp) 952
Average scaffold sequence length (bp) 307,840
Scaffold N50 (bp) 916,335
Scaffold N90 (bp) 239,778
Assembly level Scaffold
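
For readers unfamiliar with the N50 and N90 figures, the following is a minimal sketch of how such statistics are conventionally computed from a list of scaffold lengths. The function name nx and the toy lengths are illustrative assumptions, not part of the OGRA release; the last line only checks that the reported average length agrees with the reported genome size and scaffold count.

```python
def nx(lengths, percent):
    """Return the Nx length: the shortest scaffold in the smallest set of
    longest scaffolds whose combined length reaches `percent` of the total."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * percent:
            return length


# Hypothetical toy lengths, not taken from the OGRA assembly.
toy_lengths = [10, 8, 6, 4, 2]
toy_n50 = nx(toy_lengths, 50)
toy_n90 = nx(toy_lengths, 90)

# Consistency check on the figures reported above:
# genome size (bp) divided by the number of scaffold sequences.
reported_average = round(736_660_308 / 2_393)
```

Applied to the real scaffold lengths, nx(lengths, 50) and nx(lengths, 90) would be expected to reproduce the Scaffold N50 and Scaffold N90 rows, assuming the page uses this conventional definition.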

Assembly

The Oryza granulata OGRA Assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file) GWHAAKB00000000.genome.fasta.gz
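
Once the FASTA file has been downloaded, per-sequence lengths and overall GC content can be recomputed from it. The sketch below assumes a local copy of the gzip-compressed file; the function names fasta_stats and stats_from_gzip are hypothetical. The GC fraction here is taken over all characters, including any N bases, which may differ from how the reported 45.87% was calculated.

```python
import gzip


def fasta_stats(lines):
    """Return ({sequence_id: length}, overall GC fraction) for FASTA lines."""
    lengths = {}
    gc = 0
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            parts = line[1:].split()
            current = parts[0] if parts else ""
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
            upper = line.upper()
            gc += upper.count("G") + upper.count("C")
    total = sum(lengths.values())
    return lengths, (gc / total if total else 0.0)


def stats_from_gzip(path):
    """Stream a gzip-compressed FASTA file from a local path (assumed to exist)."""
    with gzip.open(path, "rt") as handle:
        return fasta_stats(handle)
```

For example, stats_from_gzip("GWHAAKB00000000.genome.fasta.gz") would return the length of every sequence in the file, from which the assembly statistics can be checked.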

Gene Predictions

Gene prediction files for the Oryza granulata OGRA genome are available in GFF3 and FASTA formats.

Downloads

Genes (GFF3 file) GWHAAKB00000000.gff.gz
CDS sequences (FASTA file) GWHAAKB00000000.RNA.fasta.gz
Protein sequences (FASTA file) GWHAAKB00000000.Protein.faa.gz
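
A quick sanity check on the GFF3 file is to tally its feature types. The sketch below relies only on the standard GFF3 layout (nine tab-separated columns, feature type in the third column, comment lines starting with #). The function name count_feature_types is hypothetical, and whether the number of gene features in this file equals the 40,131 protein-coding genes quoted in the abstract is an assumption that should be verified against the file itself.

```python
from collections import Counter


def count_feature_types(lines):
    """Count GFF3 records by feature type (column 3)."""
    counts = Counter()
    for line in lines:
        # A ##FASTA directive marks the end of the annotation section.
        if line.startswith("##FASTA"):
            break
        # Skip blank lines, directives and comments.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            continue  # ignore malformed records rather than guessing
        counts[fields[2]] += 1
    return counts
```

With a local copy, the file could be streamed with gzip.open("GWHAAKB00000000.gff.gz", "rt") and passed directly to count_feature_types.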

Functional Analysis

Functional annotation for the Oryza granulata OGRA assembly is available for download below. The proteins were analyzed using InterProScan to assign InterPro domains (Pfam).

Downloads

Domain from InterProScan Oryza_granulata.Pfam.tsv.gz

S genes

Summary

Query        Scaffold         Size (bp)  Coordinates                   tBLASTn Hit         tBLASTn %ID  Domain
DUF247I-ZΨ   GWHAAKB00000018  2088202    1002990-1003484               Amyosuroides        67           DUF247
DUF247II-ZΨ  GWHAAKB00000018  2088202    99562-100059                  AsativaDUF247II-Z1  69           DUF247
HPS10-Z      GWHAAKB00000018  2088202    989371-989426, 989509-989683  Bhybridum_HPS10-Z   48           -
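
The coordinates in the summary above can be used to pull the corresponding regions out of a scaffold sequence. The sketch below assumes the coordinates are 1-based and inclusive and ignores strand, neither of which is stated on this page; the function names parse_segments, segments_length and extract are hypothetical.

```python
def parse_segments(text):
    """Parse 'start-end, start-end' into a list of (start, end) integer pairs."""
    segments = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        start, end = part.split("-")
        segments.append((int(start), int(end)))
    return segments


def segments_length(segments):
    """Total number of bases covered, assuming inclusive coordinates."""
    return sum(end - start + 1 for start, end in segments)


def extract(sequence, segments):
    """Concatenate the segments of `sequence` (1-based, inclusive assumed)."""
    return "".join(sequence[start - 1:end] for start, end in segments)


# Coordinates of the two-segment HPS10-Z entry as listed in the summary.
hps10_segments = parse_segments("989371-989426, 989509-989683")
hps10_length = segments_length(hps10_segments)
```

Under these assumptions the two HPS10-Z segments together span 231 bp. Extracting the actual sequence would require the GWHAAKB00000018 scaffold from the assembly FASTA file.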


© 2023 National Genomics Data Center, China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences