Rubus idaeus RiMJ Assembly & Annotation

Overview

Analysis Name Rubus idaeus RiMJ Assembly & Annotation
Sequencing technology Oxford Nanopore GridION; Illumina HiSeq
Assembly method NECAT v. 0.0.1_update20200803; RagTag v. 2.1.0
Release Date 2023-06-05
Reference Publication(s)

Price RJ, Davik J, Fernandéz Fernandéz F, Bates HJ, Lynn S, Nellist CF, Buti M, Røen D, Šurbanovski N, Alsheikh M, Harrison RJ, Sargent DJ. Chromosome-scale genome sequence assemblies of the 'Autumn Bliss' and 'Malling Jewel' cultivars of the highly heterozygous red raspberry (Rubus idaeus L.) derived from long-read Oxford Nanopore sequence data. PLoS One. 2023 May 16;18(5):e0285756. doi: 10.1371/journal.pone.0285756.

Abstract

Red raspberry (Rubus idaeus L.) is an economically valuable soft-fruit species with a relatively small (~300 Mb) but highly heterozygous diploid (2n = 2x = 14) genome. Chromosome-scale genome sequences are a vital tool in unravelling the genetic complexity controlling traits of interest in crop plants such as red raspberry, as well as for functional genomics, evolutionary studies, and pan-genomics diversity studies. In this study, we developed genome sequences of a primocane fruiting variety ('Autumn Bliss') and a floricane variety ('Malling Jewel'). The use of long-read Oxford Nanopore Technologies sequencing data yielded long read lengths that permitted well resolved genome sequences for the two cultivars to be assembled. The de novo assemblies of 'Malling Jewel' and 'Autumn Bliss' contained 79 and 136 contigs respectively, and 263.0 Mb of the 'Autumn Bliss' and 265.5 Mb of the 'Malling Jewel' assembly could be anchored unambiguously to a previously published red raspberry genome sequence of the cultivar 'Anitra'. Single copy ortholog analysis (BUSCO) revealed high levels of completeness in both genomes sequenced, with 97.4% of sequences identified in 'Autumn Bliss' and 97.7% in 'Malling Jewel'. The density of repetitive sequence contained in the 'Autumn Bliss' and 'Malling Jewel' assemblies was significantly higher than in the previously published assembly and centromeric and telomeric regions were identified in both assemblies. A total of 42,823 protein coding regions were identified in the 'Autumn Bliss' assembly, whilst 43,027 were identified in the 'Malling Jewel' assembly. These chromosome-scale genome sequences represent an excellent genomics resource for red raspberry, particularly around the highly repetitive centromeric and telomeric regions of the genome that are less complete in the previously published 'Anitra' genome sequence.

Assembly statistics

Genome size 265.7 Mb
Total ungapped length 265.7 Mb
Number of chromosomes 7
Number of scaffolds 11
Scaffold N50 36.1 Mb
Scaffold L50 4
Number of contigs 83
Contig N50 9.9 Mb
Contig L50 7
GC percent 38
Genome coverage 55.0x
Assembly level Chromosome
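
The N50 and L50 figures above follow the usual convention: sequences are sorted from longest to shortest, and N50 is the length of the sequence at which the running total first reaches half of the assembly length, while L50 is the number of sequences needed to get there. The following is a minimal sketch of that calculation. The function name `n50_l50` and the toy lengths are illustrative assumptions and are not taken from the RiMJ assembly.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of sequence lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    # Sort from longest to shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # The first sequence that brings the running total to at least
        # half of the assembly defines N50; its rank is L50.
        if running >= half_total:
            return length, count


# Toy lengths (hypothetical, not RiMJ data): total 150, half 75.
toy_lengths = [50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_lengths)
print(n50, l50)  # prints "40 2"
```

Applied to the scaffold lengths of this assembly, the same calculation would be expected to give the Scaffold N50 and Scaffold L50 values listed above, and likewise for contigs.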

Assembly

The Rubus idaeus RiMJ genome assembly is available in FASTA format.

Downloads

Chromosomes (FASTA file) GCA_030142095.1_RiMJ_genomic.fna.gz
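
As a quick check after downloading, the sequence names and lengths in the compressed FASTA file can be listed with the Python standard library alone. This is a minimal sketch, assuming the file has been saved locally under the name shown above; the helper name `fasta_lengths` is hypothetical.

```python
import gzip


def fasta_lengths(path):
    """Map each sequence ID in a gzipped FASTA file to its length in bases."""
    lengths = {}
    current = None
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # The ID is the header text up to the first whitespace.
                fields = line[1:].split()
                current = fields[0] if fields else ""
                lengths[current] = 0
            elif current is not None:
                lengths[current] += len(line)
    return lengths


# Example usage (assumes the file is in the working directory):
# for name, size in fasta_lengths("GCA_030142095.1_RiMJ_genomic.fna.gz").items():
#     print(name, size)
```

The per-sequence lengths returned here can be summed and compared against the genome size reported in the assembly statistics.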

Gene Predictions

Gene prediction files for the Rubus idaeus RiMJ genome are available in GFF3 and FASTA formats.

Downloads

Genes (GFF3 file) RiMJ_ragtag_HiC.gff.gz
CDS sequences (FASTA file) RiMJ_ragtag_HiC_cds.fasta.gz
Protein sequences (FASTA file) RiMJ_ragtag_HiC_prot.fasta.gz
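
To get a first look at the annotation, the feature types in the GFF3 file can be tallied from its third column. The sketch below relies only on the general GFF3 layout (nine tab-separated columns, `#` comment lines, and an optional `##FASTA` section at the end); which feature types actually occur in this file is not stated on this page, so no expected counts are given. The function name `count_feature_types` is hypothetical.

```python
import gzip
from collections import Counter


def count_feature_types(path):
    """Count GFF3 records per feature type (column 3) in a gzipped file."""
    counts = Counter()
    with gzip.open(path, "rt") as handle:
        for line in handle:
            # An embedded FASTA section, if present, ends the annotation.
            if line.startswith("##FASTA"):
                break
            # Skip directives, comments, and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            # A valid GFF3 record has exactly nine columns.
            if len(fields) == 9:
                counts[fields[2]] += 1
    return counts


# Example usage (assumes the file is in the working directory):
# print(count_feature_types("RiMJ_ragtag_HiC.gff.gz").most_common())
```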

Functional Analysis

Functional annotation for the Rubus idaeus RiMJ genome is available for download below. The proteins were analyzed using InterProScan to assign InterPro domains (Pfam).

Downloads

Domain from InterProScan Rubus_idaeus_Malling_Jewel.Pfam.tsv.gz
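
The domain assignments can be grouped by protein with a few lines of Python. This sketch assumes the file follows the standard InterProScan TSV layout, in which column 1 is the protein accession, column 4 the analysis name (for example `Pfam`), and column 5 the signature accession. That layout has not been verified against this particular file, and the function name `pfam_domains_by_protein` is hypothetical.

```python
import gzip
from collections import defaultdict


def pfam_domains_by_protein(path):
    """Return {protein ID: set of Pfam accessions} from a gzipped TSV file."""
    domains = defaultdict(set)
    with gzip.open(path, "rt") as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            # Assumed layout: protein ID, MD5, length, analysis, signature, ...
            if len(fields) < 5:
                continue
            protein_id, analysis, signature = fields[0], fields[3], fields[4]
            if analysis == "Pfam":
                domains[protein_id].add(signature)
    return dict(domains)


# Example usage (assumes the file is in the working directory):
# hits = pfam_domains_by_protein("Rubus_idaeus_Malling_Jewel.Pfam.tsv.gz")
# print(len(hits), "proteins with at least one Pfam domain")
```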
© 2023 National Genomics Data Center, China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences