Analysis Name | Rosa multiflora RMU_r2.0 Assembly & Annotation |
Sequencing technology | Illumina HiSeq2000, Illumina MiSeq |
Assembly method | SOAPdenovo v. 2-rev240, GapCloser v. 1.10, L_RNA_scaffolder |
Release Date | 2017-10-13 |
Nakamura N, Hirakawa H, Sato S, Otagaki S, Matsumoto S, Tabata S, Tanaka Y. Genome structure of Rosa multiflora, a wild ancestor of cultivated roses. DNA Res. 2018 Apr 1;25(2):113-121. doi: 10.1093/dnares/dsx042.
Abstract
The draft genome sequence of a wild rose (Rosa multiflora Thunb.) was determined using Illumina MiSeq and HiSeq platforms. The total length of the scaffolds was 739,637,845 bp, consisting of 83,189 scaffolds, which was close to the 711 Mbp length estimated by k-mer analysis. N50 length of the scaffolds was 90,830 bp, and extent of the longest was 1,133,259 bp. The average GC content of the scaffolds was 38.9%. After gene prediction, 67,380 candidates exhibiting sequence homology to known genes and domains were extracted, which included complete and partial gene structures. This large number of genes for a diploid plant may reflect heterogeneity of the genome originating from self-incompatibility in R. multiflora. According to CEGMA analysis, 91.9% and 98.0% of the core eukaryotic genes were completely and partially conserved in the scaffolds, respectively. Genes presumably involved in flower color, scent and flowering are assigned. The results of this study will serve as a valuable resource for fundamental and applied research in the rose, including breeding and phylogenetic study of cultivated roses.
Assembly statistics
Genome size | 739.6 Mb |
Total ungapped length | 659.6 Mb |
Number of scaffolds | 83,189 |
Scaffold N50 | 90.8 kb |
Scaffold L50 | 2,234 |
Number of contigs | 142,789 |
Contig N50 | 16.9 kb |
Contig L50 | 10,806 |
GC percent | 39 |
Genome coverage | 327.0x |
Assembly level | Scaffold |
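The N50 and L50 values in the table follow the usual definitions: sort the sequences from longest to shortest, then N50 is the length of the sequence at which the running total first reaches half of the assembly length, and L50 is the number of sequences needed to get there. The Python sketch below illustrates that calculation. The helper names `fasta_lengths` and `n50_l50` are hypothetical, and the sketch assumes the input is a gzip-compressed FASTA file such as RMU_r2.0.genome.gz; it is not the pipeline used to produce the published statistics.

```python
import gzip


def fasta_lengths(path):
    """Return the length of every sequence in a gzip-compressed FASTA file."""
    lengths = []
    current = None  # length of the record being read; None before the first header
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith(">"):
                if current is not None:
                    lengths.append(current)
                current = 0
            elif current is not None:
                current += len(line.strip())
    if current is not None:
        lengths.append(current)
    return lengths


def n50_l50(lengths):
    """Return (N50, L50) for a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("no sequence lengths given")
```

Calling `n50_l50(fasta_lengths("RMU_r2.0.genome.gz"))` on a local copy of the assembly file gives a pair that can be compared with the Scaffold N50 and Scaffold L50 rows above. Contig statistics would additionally require splitting each scaffold at its runs of N, which this sketch does not do.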
The Rosa multiflora RMU_r2.0 assembly file is available in FASTA format.
Downloads
Scaffolds (FASTA file) | RMU_r2.0.genome.gz |
The Rosa multiflora RMU_r2.0 genome gene prediction files are available in GFF3 and FASTA format.
Downloads
Genes (GFF3 file) | RMU_r2.0.gff3.gz |
CDS sequences (FASTA file) | RMU_r2.0.cds.gz |
Protein sequences (FASTA file) | RMU_r2.0.pep.gz |
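As a quick sanity check on a downloaded annotation, the feature types in a GFF3 file can be tallied. GFF3 records are tab-separated with nine columns, and the third column holds the feature type (for example gene, mRNA, CDS). The sketch below is a minimal illustration in Python; the function name `count_feature_types` is hypothetical, and which feature types actually occur in RMU_r2.0.gff3.gz is not stated on this page.

```python
import gzip
from collections import Counter


def count_feature_types(path):
    """Count GFF3 records per feature type (column 3) in a gzip-compressed file."""
    counts = Counter()
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("##FASTA"):
                break  # an optional embedded sequence section follows; no more features
            if line.startswith("#") or not line.strip():
                continue  # skip directives, comments and blank lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) == 9:
                counts[fields[2]] += 1
    return counts
```

For example, `count_feature_types("RMU_r2.0.gff3.gz")` returns a `Counter` keyed by feature type, from which the number of gene records, if the file uses that type, can be read off.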
Functional annotation for the Rosa multiflora RMU_r2.0 assembly is available for download below. The proteins were analyzed using InterProScan to assign InterPro domains (Pfam).
Downloads
Domains from InterProScan | Rosa_multiflora_RMU_r2.0.Pfam.tsv.gz |
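The domain table can be grouped by protein to see which Pfam domains each predicted protein carries. The Python sketch below assumes the file follows the standard InterProScan TSV layout, in which column 1 is the protein accession, column 4 the analysis name, and column 5 the signature accession. That layout is an assumption about this download, not something stated on this page, so inspect the first few lines before relying on it. The function name `pfam_domains_by_protein` is hypothetical.

```python
import gzip
from collections import defaultdict


def pfam_domains_by_protein(path):
    """Map each protein accession to the set of Pfam accessions assigned to it.

    Assumes the standard InterProScan TSV column order (an assumption here):
    column 1 = protein accession, column 4 = analysis, column 5 = signature accession.
    """
    domains = defaultdict(set)
    with gzip.open(path, "rt") as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 5 or fields[3] != "Pfam":
                continue  # skip short lines and rows from other analyses
            domains[fields[0]].add(fields[4])
    return dict(domains)
```

With the result in hand, `len(result)` gives the number of proteins with at least one Pfam hit, provided the column assumption holds for this file.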
Rosa S genes Nucleotide
Rosa S genes Protein