Brassica nigra Ni100-LR Assembly & Annotation

Overview

Analysis Name: Brassica nigra Ni100-LR Assembly & Annotation
Sequencing technology: ONT, Illumina, CHiCAGO and Hi-C
Assembly method: correction: Canu; assembly: SMARTdenovo, wtdbg, Miniasm; Hi-C: HiRise
Release Date: 2020-08-10
Reference Publication(s)

Perumal S, Koh CS, Jin L, Buchwaldt M, Higgins EE, Zheng C, Sankoff D, Robinson SJ, Kagale S, Navabi ZK, Tang L, Horner KN, He Z, Bancroft I, Chalhoub B, Sharpe AG, Parkin IAP. A high-contiguity Brassica nigra genome localizes active centromeres and defines the ancestral Brassica genome. Nat Plants. 2020 Aug;6(8):929-941. doi: 10.1038/s41477-020-0735-y.

Abstract

It is only recently, with the advent of long-read sequencing technologies, that we are beginning to uncover previously uncharted regions of complex and inherently recursive plant genomes. To comprehensively study and exploit the genome of the neglected oilseed Brassica nigra, we generated two high-quality nanopore de novo genome assemblies. The N50 contig lengths for the two assemblies were 17.1 Mb (12 contigs), one of the best among 324 sequenced plant genomes, and 0.29 Mb (424 contigs), respectively, reflecting recent improvements in the technology. Comparison with a de novo short-read assembly corroborated genome integrity and quantified sequence-related error rates (0.2%). The contiguity and coverage allowed unprecedented access to low-complexity regions of the genome. Pericentromeric regions and coincidence of hypomethylation enabled localization of active centromeres and identified centromere-associated ALE family retro-elements that appear to have proliferated through relatively recent nested transposition events (<1 Ma). Genomic distances calculated based on synteny relationships were used to define a post-triplication Brassica-specific ancestral genome, and to calculate the extensive rearrangements that define the evolutionary distance separating B. nigra from its diploid relatives.

Assembly

The Brassica nigra Ni100-LR Assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file) Bnigra_NI100.v2.genome.fasta.gz
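
As a quick sanity check after download, the sequence lengths and an N50 value can be computed directly from the FASTA file. The following is a minimal sketch, not part of the published pipeline; the helper names (`read_fasta_lengths`, `n50`, `fasta_gz_lengths`) are hypothetical, and it assumes the file is a standard gzip-compressed FASTA. Because this file holds chromosome-level sequences, the value obtained is a scaffold-level N50 and is not expected to match the contig N50 quoted in the abstract.

```python
import gzip
from typing import Dict, Iterable, List


def read_fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {sequence name: length} for FASTA-formatted text lines."""
    lengths: Dict[str, int] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def n50(lengths: List[int]) -> int:
    """Length L such that sequences of length >= L cover at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty input


def fasta_gz_lengths(path: str) -> Dict[str, int]:
    """Convenience wrapper for a gzip-compressed FASTA file (assumed format)."""
    with gzip.open(path, "rt") as handle:
        return read_fasta_lengths(handle)
```

For example, `n50(list(fasta_gz_lengths("Bnigra_NI100.v2.genome.fasta.gz").values()))` would report the N50 of the sequences in the downloaded file, assuming it sits in the working directory.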

Gene Predictions

The Brassica nigra Ni100-LR genome gene prediction files are available in GFF3 and FASTA format.

Downloads

Genes (GFF3 file) Bnigra_NI100.v2.genes.gff3.gz
CDS sequences (FASTA file) Bnigra_NI100.v2.cds.fasta.gz
Protein sequences (FASTA file) Bnigra_NI100.v2.pep.fasta.gz
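
To get an overview of what the annotation contains, the feature types in the GFF3 file can be tallied. This is a small sketch under the assumption that the file follows the standard nine-column, tab-separated GFF3 layout; `count_feature_types` and `count_feature_types_gz` are hypothetical helper names, and the feature types actually present in this release (for example `gene`, `mRNA`, `CDS`) have not been verified here.

```python
import gzip
from collections import Counter
from typing import Iterable


def count_feature_types(lines: Iterable[str]) -> Counter:
    """Count GFF3 records by the feature type in column 3."""
    counts: Counter = Counter()
    for line in lines:
        # Skip directives/comments and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            continue  # not a well-formed GFF3 record
        counts[fields[2]] += 1
    return counts


def count_feature_types_gz(path: str) -> Counter:
    """Same tally for a gzip-compressed GFF3 file (assumed format)."""
    with gzip.open(path, "rt") as handle:
        return count_feature_types(handle)
```

Calling `count_feature_types_gz("Bnigra_NI100.v2.genes.gff3.gz")` would return a `Counter` keyed by feature type, which can be compared against the number of records in the CDS and protein FASTA files.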

Functional Analysis

Functional annotation for the Brassica nigra Ni100-LR genome is available for download below. The proteins were analyzed using InterProScan to assign InterPro domains (Pfam).

Downloads

Domain from InterProScan Brassica_nigra_NI100.Pfam.tsv.gz
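
The domain table can be grouped by protein with a few lines of Python. This sketch assumes the file uses the standard InterProScan TSV layout (protein accession in column 1, analysis name in column 4, signature accession in column 5, match start and end in columns 7 and 8); whether this particular file keeps that exact layout has not been checked, and `pfam_domains_by_protein` is a hypothetical helper name.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def pfam_domains_by_protein(
    lines: Iterable[str],
) -> Dict[str, List[Tuple[str, int, int]]]:
    """Map each protein ID to its Pfam hits as (accession, start, end)."""
    domains: Dict[str, List[Tuple[str, int, int]]] = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Keep only rows long enough to hold the coordinates and labeled Pfam.
        if len(fields) < 8 or fields[3] != "Pfam":
            continue
        domains[fields[0]].append((fields[4], int(fields[6]), int(fields[7])))
    return dict(domains)
```

For the compressed download, the lines can be supplied with `gzip.open("Brassica_nigra_NI100.Pfam.tsv.gz", "rt")`.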

S genes

Summary

Query: SRK
Chromosome: B7
Size (bp): 58094628
Coordinates: 50427380-50428652, 50427181-50427291, 50426962-50427098, 50426681-50426891, 50426352-50426589, 50426122-50426272, 50425739-50426041
BLASTp Hit: SRKb|AB052756.1_prot_BAB40987.1
BLASTp %ID: 39.6

Query: SCR
Chromosome: B7
Size (bp): 58094628
Coordinates: 50230099-50230168, 50230264-50230484
BLASTp Hit: XP_006301215.1
BLASTp %ID: 68.18
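
The coordinate strings in the summary list several intervals per gene, which reads most naturally as one interval per exon. A minimal sketch for parsing them and summing the covered length follows; it assumes the coordinates are 1-based and inclusive (not stated on this page), and the function names are hypothetical. The descending order of the SRK intervals may indicate a reverse-strand gene, but that is an inference, not something the table states.

```python
from typing import List, Tuple

# Coordinate strings copied from the S genes summary table.
SRK_COORDS = (
    "50427380-50428652,50427181-50427291,50426962-50427098,"
    "50426681-50426891,50426352-50426589,50426122-50426272,"
    "50425739-50426041"
)
SCR_COORDS = "50230099-50230168,50230264-50230484"


def parse_coordinates(text: str) -> List[Tuple[int, int]]:
    """Split 'start-end,start-end,...' into a list of (start, end) tuples."""
    intervals = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate trailing commas or stray whitespace
        start, end = part.split("-")
        intervals.append((int(start), int(end)))
    return intervals


def total_length(intervals: List[Tuple[int, int]]) -> int:
    """Sum interval lengths, assuming 1-based inclusive coordinates."""
    return sum(end - start + 1 for start, end in intervals)


if __name__ == "__main__":
    for name, coords in (("SRK", SRK_COORDS), ("SCR", SCR_COORDS)):
        intervals = parse_coordinates(coords)
        print(name, len(intervals), "intervals,", total_length(intervals), "bp")
```

Checking that each total is a multiple of three is a cheap way to confirm that the intervals describe a complete coding sequence under the inclusive-coordinate assumption.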

