Oryza sativa MSU7.0 Assembly & Annotation

Overview

Analysis Name Oryza sativa MSU7.0 Assembly & Annotation
Sequencing technology Illumina
Release Date 2013-02-06
Reference Publication(s)

Kawahara Y, de la Bastide M, Hamilton JP, Kanamori H, McCombie WR, Ouyang S, Schwartz DC, Tanaka T, Wu J, Zhou S, Childs KL, Davidson RM, Lin H, Quesada-Ocampo L, Vaillancourt B, Sakai H, Lee SS, Kim J, Numa H, Itoh T, Buell CR, Matsumoto T. Improvement of the Oryza sativa Nipponbare reference genome using next generation sequence and optical map data. Rice (N Y). 2013 Feb 6;6(1):4. doi: 10.1186/1939-8433-6-4.

Abstract

Background: Rice research has been enabled by access to the high quality reference genome sequence generated in 2005 by the International Rice Genome Sequencing Project (IRGSP). To further facilitate genomic-enabled research, we have updated and validated the genome assembly and sequence for the Nipponbare cultivar of Oryza sativa (japonica group).

Results: The Nipponbare genome assembly was updated by revising and validating the minimal tiling path of clones with the optical map for rice. Sequencing errors in the revised genome assembly were identified by re-sequencing the genome of two different Nipponbare individuals using the Illumina Genome Analyzer II/IIx platform. A total of 4,886 sequencing errors were identified in 321 Mb of the assembled genome indicating an error rate in the original IRGSP assembly of only 0.15 per 10,000 nucleotides. A small number (five) of insertions/deletions were identified using longer reads generated using the Roche 454 pyrosequencing platform. As the re-sequencing data were generated from two different individuals, we were able to identify a number of allelic differences between the original individual used in the IRGSP effort and the two individuals used in the re-sequencing effort. The revised assembly, termed Os-Nipponbare-Reference-IRGSP-1.0, is now being used in updated releases of the Rice Annotation Project and the Michigan State University Rice Genome Annotation Project, thereby providing a unified set of pseudomolecules for the rice community.

Conclusions: A revised, error-corrected, and validated assembly of the Nipponbare cultivar of rice was generated using optical map data, re-sequencing data, and manual curation that will facilitate on-going and future research in rice. Detection of polymorphisms between three different Nipponbare individuals highlights that allelic differences between individuals should be considered in diversity studies.

Assembly statistics

Genome size 373.8 Mb
Number of chromosomes 12
Number of scaffolds 55
Scaffold N50 30 Mb
Scaffold L50 6
Number of contigs 302
Contig N50 7.7 Mb
Contig L50 17
Assembly level Chromosome
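
The N50 and L50 figures above follow the usual definitions: sort the sequences from longest to shortest and take the smallest set that covers at least half of the total assembly length. N50 is the length of the shortest sequence in that set, and L50 is the number of sequences in it. A minimal Python sketch of the calculation follows; the function name `n50_l50` and the toy lengths are illustrative assumptions, not values from this release.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    if total == 0:
        raise ValueError("no sequence lengths given")
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # Stop at the first sequence that brings the running sum
        # to at least half of the total assembly length.
        if 2 * running >= total:
            return length, count


# Toy lengths (hypothetical, not from the MSU7.0 assembly).
print(n50_l50([10, 8, 6, 4, 2]))  # → (8, 2)
```

Passing the real scaffold or contig lengths to the same function should reproduce the reported values, provided they were computed with this definition.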

Assembly

The Oryza sativa MSU7.0 assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file) Os_genome.fa.gz
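
After downloading, the per-sequence lengths in the gzip-compressed FASTA file can be tallied with the Python standard library alone. The sketch below is one way to do it. The helper names `fasta_lengths` and `fasta_lengths_from_gz` are hypothetical, and the code assumes a conventional FASTA layout in which the record ID is the header text up to the first whitespace.

```python
import gzip


def fasta_lengths(lines):
    """Map each FASTA record ID to its sequence length."""
    lengths = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Record ID = header text up to the first whitespace.
            parts = line[1:].split()
            current = parts[0] if parts else ""
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


def fasta_lengths_from_gz(path):
    """Open a gzip-compressed FASTA file in text mode and tally lengths."""
    with gzip.open(path, "rt") as handle:
        return fasta_lengths(handle)
```

For example, `fasta_lengths_from_gz("Os_genome.fa.gz")` would return one entry per sequence in the downloaded file. How the record IDs are spelled in that file is not stated on this page.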

Gene Predictions

The Oryza sativa MSU7.0 genome gene prediction files are available in GFF3 and FASTA format.

Downloads

Genes (GFF3 file) Os.gff.gz
CDS sequences (FASTA file) Os_cds.fa.gz
Protein sequences (FASTA file) Os_pep.fa.gz
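
GFF3 is a nine-column, tab-separated format in which the third column holds the feature type (for example gene, mRNA, or CDS). A quick sanity check on the downloaded annotation is to count features by type. The sketch below assumes a standard GFF3 file; the function name `count_feature_types` and the sample rows are made up for illustration.

```python
from collections import Counter


def count_feature_types(lines):
    """Count GFF3 features by the value in column 3 (feature type)."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("##FASTA"):
            break  # optional embedded sequence section; no features follow
        if not line or line.startswith("#"):
            continue  # blank lines, directives, and comments
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # skip malformed rows rather than guess
        counts[fields[2]] += 1
    return counts


# Hypothetical rows, not taken from Os.gff.gz.
sample = [
    "##gff-version 3",
    "Chr1\ttoy\tgene\t100\t900\t.\t+\t.\tID=g1",
    "Chr1\ttoy\tmRNA\t100\t900\t.\t+\t.\tID=g1.1;Parent=g1",
    "Chr1\ttoy\tCDS\t150\t400\t.\t+\t0\tID=c1;Parent=g1.1",
    "Chr1\ttoy\tCDS\t500\t800\t.\t+\t2\tID=c2;Parent=g1.1",
]
print(dict(count_feature_types(sample)))  # → {'gene': 1, 'mRNA': 1, 'CDS': 2}
```

To run it on the real file, pass a text-mode handle such as `gzip.open("Os.gff.gz", "rt")` in place of `sample`.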

Functional Analysis

Functional annotation for Oryza sativa MSU7.0 is available for download below. The proteins were analyzed using InterProScan to assign InterPro domains (Pfam).

Downloads

Domain from InterProScan Oryza_sativa_MSU7.0.Pfam.tsv.gz
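
The column layout of the domain file is not described on this page. If it follows the standard InterProScan TSV output (column 1 protein accession, column 4 analysis name, column 5 signature accession), the Pfam domains per protein can be collected as sketched below. That layout is an assumption to verify against the actual file, and the function name `pfam_domains_by_protein` and the sample rows are hypothetical.

```python
from collections import defaultdict


def pfam_domains_by_protein(lines):
    """Group Pfam signature accessions by protein accession.

    ASSUMPTION: standard InterProScan TSV columns, i.e. protein
    accession in column 1, analysis in column 4, signature in column 5.
    """
    domains = defaultdict(set)
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 5:
            continue  # too short to hold the columns we need
        protein_id, analysis, signature = fields[0], fields[3], fields[4]
        if analysis == "Pfam":
            domains[protein_id].add(signature)
    return dict(domains)


# Made-up rows for illustration; the IDs are placeholders.
sample_rows = [
    "protA\tmd5a\t300\tPfam\tPF_A\tdesc\t10\t120\t1e-5\tT\t01-01-2020",
    "protA\tmd5a\t300\tPfam\tPF_B\tdesc\t150\t280\t1e-9\tT\t01-01-2020",
    "protB\tmd5b\t210\tOther\tX_1\tdesc\t5\t90\t1e-3\tT\t01-01-2020",
]
print(pfam_domains_by_protein(sample_rows))
```

Rows from analyses other than Pfam are ignored, so `protB` does not appear in the result for the sample rows.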

S genes

Summary

Query        Chromosome  Size (bp)  Coordinates                           tBLASTn Hit               tBLASTn %ID  Domain
DUF247I-SΨ   Chr5        29958434   6058829-6059311                       LpSDUF247-I_chromosome1   76           DUF247
DUF247II-S   Chr5        29958434   6046665-6048326                       LpSDUF247-II_chromosome1  68           DUF247
HPS10-S      Chr5        29958434   6050205-6050364, 6050467-6050603      LpsS_contig11029          38           -
DUF247I-ZΨ   Chr4        35502694   32941990-32942343                     AatlanticaDUF247I-Z       74           DUF247
DUF247II-ZΨ  Chr4        35502694   32949311-32950072                     Psupina Chr4 7            72           DUF247
HPS10-Z      Chr4        35502694   32946909-32947006, 32947136-32947307  AerianthaHPS10-Z          35           -
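
The coordinate entries in the summary are start-end ranges, and the two HPS10 rows list two comma-separated ranges each. A small Python sketch for turning such a string into intervals and summing their lengths follows. It assumes the coordinates are 1-based and inclusive, which this page does not state, and the function names are hypothetical.

```python
def parse_coordinates(text):
    """Parse 'start-end, start-end' into a list of (start, end) tuples."""
    intervals = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        start, end = (int(value) for value in part.split("-"))
        intervals.append((start, end))
    return intervals


def total_length(intervals):
    """Sum interval lengths, ASSUMING 1-based inclusive coordinates."""
    return sum(end - start + 1 for start, end in intervals)


hps10_s = parse_coordinates("6050205-6050364, 6050467-6050603")
print(hps10_s)                # → [(6050205, 6050364), (6050467, 6050603)]
print(total_length(hps10_s))  # → 297
```

Under the same assumption, the single DUF247I-S range 6058829-6059311 spans 483 bp.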


© 2023 National Genomics Data Center, China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences