Rubus occidentalis v3.0 Assembly & Annotation

Overview

Analysis Name: Rubus occidentalis v3.0 Assembly & Annotation
Sequencing technology: Illumina NextSeq
Assembly method: Proximo Hi-C scaffolding pipeline
Release Date: 2018-04-30
Reference Publication(s)

VanBuren R, Wai CM, Colle M, Wang J, Sullivan S, Bushakra JM, Liachko I, Vining KJ, Dossett M, Finn CE, Jibran R, Chagné D, Childs K, Edger PP, Mockler TC, Bassil NV. A near complete, chromosome-scale assembly of the black raspberry (Rubus occidentalis) genome. Gigascience. 2018 Aug 1;7(8):giy094. doi: 10.1093/gigascience/giy094.

Abstract

Background: The fragmented nature of most draft plant genomes has hindered downstream gene discovery, trait mapping for breeding, and other functional genomics applications. There is a pressing need to improve or finish draft plant genome assemblies.
Findings: Here, we present a chromosome-scale assembly of the black raspberry genome using single-molecule real-time Pacific Biosciences sequencing and high-throughput chromatin conformation capture (Hi-C) genome scaffolding. The updated V3 assembly has a contig N50 of 5.1 Mb, representing an ∼200-fold improvement over the previous Illumina-based version. Each of the 235 contigs was anchored and oriented into seven chromosomes, correcting several major misassemblies. Black raspberry V3 contains 47 Mb of new sequences including large pericentromeric regions and thousands of previously unannotated protein-coding genes. Among the new genes are hundreds of expanded tandem gene arrays that were collapsed in the Illumina-based assembly. Detailed comparative genomics with the high-quality V4 woodland strawberry genome (Fragaria vesca) revealed near-perfect 1:1 synteny with dramatic divergence in tandem gene array composition. Lineage-specific tandem gene arrays in black raspberry are related to agronomic traits such as disease resistance and secondary metabolite biosynthesis.
Conclusions: The improved resolution of tandem gene arrays highlights the need to reassemble these highly complex and biologically important regions in draft plant genomes. The updated, high-quality black raspberry reference genome will be useful for comparative genomics across the horticulturally important Rosaceae family and enable the development of marker assisted breeding in Rubus.

Assembly statistics

Assembly

The Rubus occidentalis v3.0 assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file) Rubus_occ_V3_10-12-17.fasta.gz
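
As a quick sanity check after downloading the assembly, the sequence lengths can be tallied directly from the compressed FASTA file. The following is a minimal sketch using only the Python standard library; the function names `fasta_lengths` and `n50` are illustrative and not part of any tool distributed with this release, and the code assumes the file is an ordinary gzip-compressed FASTA file.

```python
import gzip


def fasta_lengths(path):
    """Return {sequence_id: length} for a gzip-compressed FASTA file."""
    lengths = {}
    name = None
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Use the first whitespace-delimited token of the header as the ID.
                fields = line[1:].split()
                name = fields[0] if fields else ""
                lengths[name] = 0
            elif name is not None:
                lengths[name] += len(line)
    return lengths


def n50(lengths):
    """Return the N50 of an iterable of sequence lengths (0 if empty)."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0
```

Calling `fasta_lengths("Rubus_occ_V3_10-12-17.fasta.gz")` on the downloaded file and passing the resulting values to `n50` gives a sequence-level N50. Because the download contains scaffolded chromosomes, that figure is not directly comparable to the contig N50 of 5.1 Mb quoted in the abstract.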

Gene Predictions

The Rubus occidentalis v3.0 gene prediction files are available in GFF3 and FASTA formats.

Downloads

Genes (GFF3 file) Rubus_occ_V3.genes.gff3.gz
CDS sequences (FASTA file) Rubus_occ_V3.transcripts.fasta.gz
Protein sequences (FASTA file) Rubus_occ_V3.proteins.fasta.gz
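
To get a first look at the gene predictions, the GFF3 file can be summarized per sequence. The sketch below counts features of a given type for each sequence ID. It assumes the file follows the standard nine-column, tab-delimited GFF3 layout and that gene models use the feature type `gene`; both are assumptions about this particular file, and `count_features` is a hypothetical helper name.

```python
import gzip
from collections import Counter


def count_features(path, feature_type="gene"):
    """Count features of one type per sequence ID in a gzipped GFF3 file."""
    counts = Counter()
    with gzip.open(path, "rt") as handle:
        for line in handle:
            # Skip comments, directives, and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            cols = line.rstrip("\n").split("\t")
            # Standard GFF3 records have exactly nine tab-separated columns.
            if len(cols) != 9:
                continue
            if cols[2] == feature_type:
                counts[cols[0]] += 1
    return counts
```

For example, `count_features("Rubus_occ_V3.genes.gff3.gz")` would return a `Counter` keyed by sequence ID, and `sum(counts.values())` would give the total number of annotated genes under the stated assumptions.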

Functional Analysis

Functional annotation for the Rubus occidentalis v3.0 genome is available for download below. The proteins were analyzed using InterProScan to assign InterPro domains (Pfam).

Downloads

Domain from InterProScan Rubus_occidentalis_v3.0.Pfam.tsv.gz
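
The domain assignments can be grouped by protein with a few lines of Python. This sketch assumes the file uses the standard InterProScan tab-separated layout, in which column 1 is the protein accession, column 4 the analysis name (for example `Pfam`), and column 5 the signature accession. Whether this download follows that layout exactly has not been verified here, and `pfam_domains` is a hypothetical helper name.

```python
import gzip
from collections import defaultdict


def pfam_domains(path):
    """Map each protein ID to the set of Pfam accessions assigned to it."""
    domains = defaultdict(set)
    with gzip.open(path, "rt") as handle:
        for line in handle:
            cols = line.rstrip("\n").split("\t")
            # Need at least protein ID, MD5, length, analysis, and accession.
            if len(cols) < 5:
                continue
            # Assumed layout: column 4 names the analysis, column 5 the signature.
            if cols[3] == "Pfam":
                domains[cols[0]].add(cols[4])
    return dict(domains)
```

With the result of `pfam_domains("Rubus_occidentalis_v3.0.Pfam.tsv.gz")`, proteins sharing a given Pfam accession can be collected, which is one simple way to start looking for the tandem gene arrays described in the abstract.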