Brassica napus Da-Ae Assembly & Annotation

Overview

Analysis Name Brassica napus Da-Ae Assembly & Annotation
Sequencing technology PacBio Sequel; Illumina HiSeq
Assembly method Canu v. 1.6; Pilon v. 1.22; HiRise v. FEB-2018
Release Date 2021-10-08
Reference Publication(s)

Davis JT, Li R, Kim S, Michelmore R, Kim S, Maloof JN. Whole-genome sequence of synthetically derived Brassica napus inbred cultivar Da-Ae. G3 (Bethesda). 2023 Apr 11;13(4):jkad026. doi: 10.1093/g3journal/jkad026.

Abstract

Brassica napus, a globally important oilseed crop, is an allotetraploid hybrid species with two subgenomes originating from Brassica rapa and Brassica oleracea. The presence of two highly similar subgenomes has made the assembly of a complete draft genome challenging and has also resulted in natural homoeologous exchanges between the genomes, resulting in variations in gene copy number, which further complicates assigning sequences to correct chromosomes. Despite these challenges, high-quality draft genomes of this species have been released. Using third generation sequencing and assembly technologies, we generated a new genome assembly for the synthetic B. napus cultivar Da-Ae. Through the use of long reads, linked-reads, and Hi-C proximity data, we assembled a new draft genome that provides a high-quality reference genome of a synthetic B. napus. In addition, we identified potential hotspots of homoeologous exchange between subgenomes within Da-Ae, based on their presence in other independently derived lines. The occurrence of these hotspots may provide insight into the genetic rearrangements required for B. napus to be viable following the hybridization of B. rapa and B. oleracea.

Assembly statistics

Genome size 1 Gb
Total ungapped length 1 Gb
Number of chromosomes 19
Number of scaffolds 3,164
Scaffold N50 48.2 Mb
Scaffold L50 9
Number of contigs 4,004
Contig N50 1.6 Mb
Contig L50 177
GC percent 37
Genome coverage 100.0x
Assembly level Chromosome
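
For readers unfamiliar with the N50 and L50 figures above: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length, and L50 is the number of sequences in that set. The following is a minimal sketch of that calculation, not the pipeline used for this assembly. The function name n50_l50 and the toy lengths are illustrative assumptions.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("lengths must be non-empty")


# Toy lengths (not taken from the Da-Ae assembly): total 150, half 75.
toy_lengths = [50, 40, 30, 20, 10]
print(n50_l50(toy_lengths))  # (40, 2)
```

Applied to the scaffold lengths of this assembly, the same calculation should correspond to the reported Scaffold N50 of 48.2 Mb and Scaffold L50 of 9.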

Assembly

The Brassica napus Da-Ae assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file) GCA_020379485.1_Da-Ae_genomic.fna.gz
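
Once downloaded, the gzipped FASTA file can be read without decompressing it on disk. The sketch below uses only the Python standard library and tallies the length of each record. The function names are illustrative, and the file is assumed to be in the current working directory.

```python
import gzip


def fasta_lengths(lines):
    """Map each FASTA record ID (text up to the first space) to its length."""
    lengths = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            current = header[0] if header else ""
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


def fasta_lengths_from_gz(path):
    """Same as fasta_lengths, but reads a gzip-compressed FASTA file."""
    with gzip.open(path, "rt") as handle:
        return fasta_lengths(handle)


# Toy records (not from the assembly) to show the behaviour.
toy = [">seq1 toy record", "ACGT", "AC", ">seq2", "GGG"]
print(fasta_lengths(toy))  # {'seq1': 6, 'seq2': 3}
```

For the real file, a call such as fasta_lengths_from_gz("GCA_020379485.1_Da-Ae_genomic.fna.gz") would return one entry per sequence in the download.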

Gene Predictions

The Brassica napus Da-Ae genome gene prediction files are available in GFF3 and FASTA format.

Downloads

Genes (GFF3 file) GCA_020379485.1_Da-Ae_genomic.gff.gz
CDS sequences (FASTA file) GCA_020379485.1_Da-Ae_translated_cds.faa.gz
Protein sequences (FASTA file) GCA_020379485.1_Da-Ae_protein.faa.gz
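
GFF3 is a tab-delimited format with nine columns per feature line, the third of which holds the feature type (gene, mRNA, CDS and so on). As a quick sanity check on the downloaded annotation, the sketch below counts features by type. The function name and the toy lines (including the identifier chrX) are hypothetical, and the feature types actually present in this file have not been checked here.

```python
import gzip
from collections import Counter


def count_feature_types(lines):
    """Count GFF3 features by the type given in column 3."""
    counts = Counter()
    for line in lines:
        if line.startswith("##FASTA"):
            break  # an embedded sequence section ends the annotations
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            continue  # ignore malformed lines
        counts[fields[2]] += 1
    return counts


# Toy GFF3 lines with a made-up sequence ID.
toy = [
    "##gff-version 3",
    "chrX\tsketch\tgene\t1\t100\t.\t+\t.\tID=g1",
    "chrX\tsketch\tmRNA\t1\t100\t.\t+\t.\tID=m1;Parent=g1",
    "chrX\tsketch\tCDS\t1\t60\t.\t+\t0\tParent=m1",
    "chrX\tsketch\tCDS\t70\t100\t.\t+\t0\tParent=m1",
]
print(dict(count_feature_types(toy)))  # {'gene': 1, 'mRNA': 1, 'CDS': 2}
```

To run it on the download, open the file with gzip.open("GCA_020379485.1_Da-Ae_genomic.gff.gz", "rt") and pass the handle to count_feature_types.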

Functional Analysis

Functional annotation for the Brassica napus Da-Ae genome is available for download below. The predicted proteins were analyzed with InterProScan to assign InterPro domains (Pfam).

Downloads

Domain from InterProScan Brassica_napus_Da-Ae.Pfam.tsv.gz

S genes

Summary

Query	Chromosome	Size (bp)	Coordinates	BLASTp Hit	BLASTp %ID
SRK1	NC_063434.1 (A1)	30963416	20540509-20541805,20540041-20540426,20539748-20539958,20539421-20539658,20539186-20539336,20538803-20539096	SRKb|AB052756.1_prot_BAB40987.1_1	34
SCR1	NC_063434.1 (A1)	30963416	20220145-20220214,20220313-20220533	XP_018438641.1	87
SRK2	NC_063449.1 (C6)	48209797	39370728-39372048,39372677-39372817,39372915-39373093,39376096-39376306,39376398-39376635,39376760-39376910,39376997-39377326	sp|Q09092|SRK6_BRAOV	67
SCR2	NC_063449.1 (C6)	48209797	39308868-39308953,39308656-39308806	BAD29945.1	80
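
The Coordinates entries above are comma-separated start-end segments on the named chromosome. The sketch below parses one such string and sums the segment lengths. It assumes the coordinates are 1-based and inclusive, which the page does not state; under that assumption each total in the table is a multiple of three, as expected for coding segments. The function names are illustrative.

```python
def parse_segments(coords):
    """Parse 'start-end,start-end,...' into a list of (start, end) tuples."""
    segments = []
    for part in coords.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate trailing commas left by line wrapping
        start, end = (int(value) for value in part.split("-"))
        segments.append((start, end))
    return segments


def total_length(segments):
    """Sum segment lengths, assuming 1-based inclusive coordinates."""
    return sum(end - start + 1 for start, end in segments)


# SCR1 coordinates as listed in the summary table.
scr1 = parse_segments("20220145-20220214,20220313-20220533")
print(scr1)                # [(20220145, 20220214), (20220313, 20220533)]
print(total_length(scr1))  # 291
```

The same functions give 2577 bp for SRK1 and 2571 bp for SRK2. Segments listed in descending order (as for SRK1 and SCR2) may indicate a minus-strand gene, but strand is not given on this page and should be confirmed against the GFF3 file.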

© 2023 National Genomics Data Center, China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences