Analysis Name | Brassica napus Da-Ae Assembly & Annotation |
Sequencing technology | PacBio Sequel; Illumina HiSeq |
Assembly method | Canu v. 1.6; Pilon v. 1.22; HiRise v. FEB-2018 |
Release Date | 2021-10-08 |
Publication | Davis JT, Li R, Kim S, Michelmore R, Kim S, Maloof JN. Whole-genome sequence of synthetically derived Brassica napus inbred cultivar Da-Ae. G3 (Bethesda). 2023 Apr 11;13(4):jkad026. doi: 10.1093/g3journal/jkad026. |
Abstract
Brassica napus, a globally important oilseed crop, is an allotetraploid hybrid species with two subgenomes originating from Brassica rapa and Brassica oleracea. The presence of two highly similar subgenomes has made the assembly of a complete draft genome challenging and has also resulted in natural homoeologous exchanges between the genomes, resulting in variations in gene copy number, which further complicates assigning sequences to correct chromosomes. Despite these challenges, high-quality draft genomes of this species have been released. Using third generation sequencing and assembly technologies, we generated a new genome assembly for the synthetic B. napus cultivar Da-Ae. Through the use of long reads, linked-reads, and Hi-C proximity data, we assembled a new draft genome that provides a high-quality reference genome of a synthetic B. napus. In addition, we identified potential hotspots of homoeologous exchange between subgenomes within Da-Ae, based on their presence in other independently derived lines. The occurrence of these hotspots may provide insight into the genetic rearrangements required for B. napus to be viable following the hybridization of B. rapa and B. oleracea.
Assembly statistics
Genome size | 1 Gb |
Total ungapped length | 1 Gb |
Number of chromosomes | 19 |
Number of scaffolds | 3,164 |
Scaffold N50 | 48.2 Mb |
Scaffold L50 | 9 |
Number of contigs | 4,004 |
Contig N50 | 1.6 Mb |
Contig L50 | 177 |
GC percent | 37 |
Genome coverage | 100.0x |
Assembly level | Chromosome |
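For readers unfamiliar with the N50 and L50 figures in the table above, the following is a minimal sketch of how such values are conventionally computed from a list of sequence lengths. The function name `n50_l50` and the toy lengths are illustrative assumptions, not part of the Da-Ae release, and the assembly's reported values were not produced by this code.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total.
    L50 is the number of sequences in that set.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count


# Toy example (hypothetical lengths, not Da-Ae data): total is 300,
# and the two longest sequences (80 + 70) reach the halfway point.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50_l50(toy_lengths))  # (70, 2)
```

Read this way, a scaffold L50 of 9 with a scaffold N50 of 48.2 Mb means that the nine longest scaffolds, each at least 48.2 Mb long, together account for at least half of the assembled length.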
The Brassica napus Da-Ae genome assembly is available in FASTA format.
Downloads
Chromosomes (FASTA file) | GCA_020379485.1_Da-Ae_genomic.fna.gz |
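As a rough illustration of how the downloaded assembly file could be inspected, the sketch below reads a gzipped FASTA file with the Python standard library and reports per-sequence lengths and overall GC percent. The function name `fasta_stats` is hypothetical, and the choice to compute GC over A/C/G/T bases only (ignoring N and other ambiguity codes) is an assumption; the GC percent reported in the statistics table may have been calculated differently.

```python
import gzip
import os


def fasta_stats(path):
    """Return ({sequence_name: length}, gc_percent) for a gzipped FASTA file."""
    lengths = {}
    gc = 0      # count of G and C bases
    acgt = 0    # count of unambiguous bases (A, C, G, T)
    name = None
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Use the first whitespace-delimited token as the name.
                name = line[1:].split()[0]
                lengths[name] = 0
            elif name is not None:
                seq = line.upper()
                lengths[name] += len(seq)
                gc += seq.count("G") + seq.count("C")
                acgt += sum(seq.count(base) for base in "ACGT")
    gc_percent = 100.0 * gc / acgt if acgt else 0.0
    return lengths, gc_percent


# Only runs if the assembly file has been downloaded to the working directory.
assembly_path = "GCA_020379485.1_Da-Ae_genomic.fna.gz"
if os.path.exists(assembly_path):
    seq_lengths, gc_pct = fasta_stats(assembly_path)
    print(len(seq_lengths), "sequences;", sum(seq_lengths.values()), "bp;",
          round(gc_pct, 1), "% GC")
```

Reading line by line keeps memory use low, which matters for an assembly of roughly 1 Gb.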
The Brassica napus Da-Ae gene prediction files are available in GFF3 and FASTA formats.
Downloads
Genes (GFF3 file) | GCA_020379485.1_Da-Ae_genomic.gff.gz |
CDS sequences (FASTA file) | GCA_020379485.1_Da-Ae_translated_cds.faa.gz |
Protein sequences (FASTA file) | GCA_020379485.1_Da-Ae_protein.faa.gz |
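One quick sanity check on the GFF3 download is to tally how many features of each type it contains. The sketch below relies only on the standard GFF3 layout of nine tab-separated columns, with the feature type in the third column. The function name `count_feature_types` is hypothetical, and which feature types actually appear in this particular file is not stated here, so no counts are shown.

```python
import gzip
import os
from collections import Counter


def count_feature_types(path):
    """Count GFF3 features by type (column 3) in a gzipped GFF3 file."""
    counts = Counter()
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("##FASTA"):
                break  # optional embedded sequence section; no features follow
            if line.startswith("#") or not line.strip():
                continue  # directives, comments, and blank lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) != 9:
                continue  # skip malformed lines rather than guessing
            counts[fields[2]] += 1
    return counts


# Only runs if the annotation file has been downloaded to the working directory.
gff_path = "GCA_020379485.1_Da-Ae_genomic.gff.gz"
if os.path.exists(gff_path):
    for feature_type, n in count_feature_types(gff_path).most_common():
        print(feature_type, n)
```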
Functional annotation for the Brassica napus Da-Ae genome is available for download below. The predicted proteins were analyzed with InterProScan to assign InterPro domains (Pfam).
Downloads
Domain from InterProScan | Brassica_napus_Da-Ae.Pfam.tsv.gz |
Summary
Query | Chromosome | Chromosome size (bp) | Coordinates | BLASTp Hit | BLASTp %ID |
SRK1 | NC_063434.1 (A1) | 30963416 | 20540509-20541805,20540041-20540426,20539748-20539958,20539421-20539658,20539186-20539336,20538803-20539096 | SRKb|AB052756.1_prot_BAB40987.1_1 | 34 |
SCR1 | NC_063434.1 (A1) | 30963416 | 20220145-20220214,20220313-20220533 | XP_018438641.1 | 87 |
SRK2 | NC_063449.1 (C6) | 48209797 | 39370728-39372048,39372677-39372817,39372915-39373093,39376096-39376306,39376398-39376635,39376760-39376910,39376997-39377326 | sp|Q09092|SRK6_BRAOV | 67 |
SCR2 | NC_063449.1 (C6) | 48209797 | 39308868-39308953,39308656-39308806 | BAD29945.1 | 80 |
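The Coordinates column above lists several comma-separated segments per query. The sketch below parses those strings and sums the segment lengths. It rests on two assumptions that the table does not state: that the coordinates are 1-based and inclusive, and that segments listed in descending order indicate a feature on the minus strand. The function names are hypothetical.

```python
def parse_segments(coords):
    """Parse 'start-end,start-end,...' into a list of (start, end) tuples."""
    segments = []
    for part in coords.split(","):
        start, end = part.strip().split("-")
        segments.append((int(start), int(end)))
    return segments


def total_length(segments):
    """Sum segment lengths, assuming 1-based inclusive coordinates."""
    return sum(end - start + 1 for start, end in segments)


def inferred_strand(segments):
    """Guess the strand from segment order (an assumption, not a stated fact)."""
    starts = [start for start, _ in segments]
    if len(starts) < 2:
        return "?"
    if starts == sorted(starts):
        return "+"
    if starts == sorted(starts, reverse=True):
        return "-"
    return "?"


# Coordinates copied from the summary table above.
rows = {
    "SRK1": "20540509-20541805,20540041-20540426,20539748-20539958,"
            "20539421-20539658,20539186-20539336,20538803-20539096",
    "SCR1": "20220145-20220214,20220313-20220533",
    "SRK2": "39370728-39372048,39372677-39372817,39372915-39373093,"
            "39376096-39376306,39376398-39376635,39376760-39376910,"
            "39376997-39377326",
    "SCR2": "39308868-39308953,39308656-39308806",
}

for name, coords in rows.items():
    segs = parse_segments(coords)
    length = total_length(segs)
    print(name, inferred_strand(segs), length, length % 3)
# SRK1 - 2577 0
# SCR1 + 291 0
# SRK2 + 2571 0
# SCR2 - 237 0
```

Under the inclusive-coordinate assumption, every total is a multiple of three, which is consistent with the segments being coding exons, though the table itself does not say so.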