Analysis Name | Brassica nigra Ni100-LR Assembly & Annotation |
Sequencing technology | ONT, Illumina, CHiCAGO and Hi-C |
Assembly method | correction: Canu, assembly: SMARTdenovo, wtdbg, Miniasm, Hi-C: HiRise |
Release Date | 2020-08-10 |
Perumal S, Koh CS, Jin L, Buchwaldt M, Higgins EE, Zheng C, Sankoff D, Robinson SJ, Kagale S, Navabi ZK, Tang L, Horner KN, He Z, Bancroft I, Chalhoub B, Sharpe AG, Parkin IAP. A high-contiguity Brassica nigra genome localizes active centromeres and defines the ancestral Brassica genome. Nat Plants. 2020 Aug;6(8):929-941. doi: 10.1038/s41477-020-0735-y.
Abstract
It is only recently, with the advent of long-read sequencing technologies, that we are beginning to uncover previously uncharted regions of complex and inherently recursive plant genomes. To comprehensively study and exploit the genome of the neglected oilseed Brassica nigra, we generated two high-quality nanopore de novo genome assemblies. The N50 contig lengths for the two assemblies were 17.1 Mb (12 contigs), one of the best among 324 sequenced plant genomes, and 0.29 Mb (424 contigs), respectively, reflecting recent improvements in the technology. Comparison with a de novo short-read assembly corroborated genome integrity and quantified sequence-related error rates (0.2%). The contiguity and coverage allowed unprecedented access to low-complexity regions of the genome. Pericentromeric regions and coincidence of hypomethylation enabled localization of active centromeres and identified centromere-associated ALE family retro-elements that appear to have proliferated through relatively recent nested transposition events (<1 Ma). Genomic distances calculated based on synteny relationships were used to define a post-triplication Brassica-specific ancestral genome, and to calculate the extensive rearrangements that define the evolutionary distance separating B. nigra from its diploid relatives.
The Brassica nigra Ni100-LR Assembly file is available in FASTA format.
Downloads
Chromosomes (FASTA file) | Bnigra_NI100.v2.genome.fasta.gz |
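A downloaded assembly can be sanity-checked with a few lines of Python. The sketch below is not part of the release: it reads a FASTA file (gzipped or plain), collects the length of each sequence, and computes N50, taken here as the largest length L such that sequences of length L or more cover at least half of the total. The function names `fasta_lengths` and `n50` are illustrative. Note that an N50 computed over chromosome-level sequences is a different quantity from the contig N50 quoted in the abstract.

```python
import gzip


def fasta_lengths(path):
    """Return {sequence_id: length} for a FASTA file (gzipped or plain)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    lengths = {}
    name = None
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # The ID is the first whitespace-delimited token of the header.
                parts = line[1:].split()
                name = parts[0] if parts else ""
                lengths[name] = 0
            elif name is not None:
                lengths[name] += len(line)
    return lengths


def n50(lengths):
    """Largest length L such that sequences of length >= L cover half the total."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty input
```

Assuming the file from the table above has been saved to the working directory, `lengths = fasta_lengths("Bnigra_NI100.v2.genome.fasta.gz")` followed by `n50(lengths.values())` gives the sequence-level N50.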
The Brassica nigra Ni100-LR genome gene prediction files are available in GFF3 and FASTA format.
Downloads
Genes (GFF3 file) | Bnigra_NI100.v2.genes.gff3.gz |
CDS sequences (FASTA file) | Bnigra_NI100.v2.cds.fasta.gz |
Protein sequences (FASTA file) | Bnigra_NI100.v2.pep.fasta.gz |
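For a quick look at the gene models, the GFF3 file can be read without any third-party library. The following is a minimal sketch that relies only on the general GFF3 layout (nine tab-separated columns, `key=value` pairs in the ninth); the feature types and attribute keys actually present in this particular file have not been checked here, and the helper names are illustrative.

```python
import gzip
from collections import Counter


def parse_gff3_attributes(field):
    """Split the ninth GFF3 column ('key=value;key=value') into a dict."""
    attributes = {}
    for pair in field.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attributes[key.strip()] = value.strip()
    return attributes


def iter_gff3(path):
    """Yield one dict per feature line of a GFF3 file (gzipped or plain)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith("##FASTA"):
                break  # optional embedded sequence section; no features follow
            if not line.strip() or line.startswith("#"):
                continue  # blank lines, comments, and directives
            cols = line.rstrip("\n").split("\t")
            if len(cols) != 9:
                continue  # skip malformed lines rather than guess
            yield {
                "seqid": cols[0],
                "type": cols[2],
                "start": int(cols[3]),  # GFF3 coordinates are 1-based, inclusive
                "end": int(cols[4]),
                "strand": cols[6],
                "attributes": parse_gff3_attributes(cols[8]),
            }


def count_feature_types(path):
    """Count features by type (for example gene, mRNA, exon, CDS)."""
    return Counter(feature["type"] for feature in iter_gff3(path))
```

Calling `count_feature_types("Bnigra_NI100.v2.genes.gff3.gz")` on the downloaded file would give a first overview of what the annotation contains.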
Functional annotation for the Brassica nigra Ni100-LR genome is available for download below. The proteins were analyzed using InterProScan to assign InterPro domains (Pfam).
Downloads
Domain from InterProScan | Brassica_nigra_NI100.Pfam.tsv.gz |
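The domain table can be grouped by protein with the standard library alone. This sketch assumes the file follows the usual InterProScan TSV layout (column 1 protein accession, column 4 analysis name, column 5 signature accession, column 6 signature description, columns 7 and 8 domain start and stop); that assumption has not been verified against this file, so the column order should be checked before relying on the result. The function name is illustrative.

```python
import csv
import gzip
from collections import defaultdict


def pfam_domains_by_protein(path):
    """Map protein ID -> list of (pfam_id, description, start, stop).

    Assumes standard InterProScan TSV column order; rows whose fourth
    column is not 'Pfam' (including any header row) are ignored.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    domains = defaultdict(list)
    with opener(path, "rt", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) < 8 or row[3] != "Pfam":
                continue
            domains[row[0]].append((row[4], row[5], int(row[6]), int(row[7])))
    return dict(domains)
```

With the file from the table above saved locally, `pfam_domains_by_protein("Brassica_nigra_NI100.Pfam.tsv.gz")` returns a dictionary keyed by protein ID.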
Summary
Query | Chromosome | Size(bp) | Coordinates | BLASTp Hit | BLASTp %ID |
SRK | B7 | 58094628 | 50427380-50428652,50427181-50427291,50426962-50427098,50426681-50426891,50426352-50426589,50426122-50426272,50425739-50426041 | SRKb|AB052756.1_prot_BAB40987.1 | 39.6 |
SCR | B7 | 58094628 | 50230099-50230168,50230264-50230484 | XP_006301215.1 | 68.18 |
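The Coordinates column lists each hit as comma-separated `start-end` segments. As a small worked example, the sketch below parses those strings, sums the segment lengths, and reports the order in which the segments are listed. It assumes the coordinates are 1-based and inclusive, which the table does not state; the function names are illustrative.

```python
def parse_segments(coordinates):
    """Parse 'start-end,start-end,...' into a list of (start, end) integers."""
    segments = []
    for part in coordinates.split(","):
        start, end = part.strip().split("-")
        segments.append((int(start), int(end)))
    return segments


def total_length(segments):
    """Sum of segment lengths, treating both coordinates as inclusive."""
    return sum(end - start + 1 for start, end in segments)


def listing_order(segments):
    """Describe whether segment starts are listed ascending or descending."""
    starts = [start for start, _ in segments]
    if starts == sorted(starts):
        return "ascending"
    if starts == sorted(starts, reverse=True):
        return "descending"
    return "mixed"


# Coordinate strings copied from the Summary table above.
srk = parse_segments(
    "50427380-50428652,50427181-50427291,50426962-50427098,"
    "50426681-50426891,50426352-50426589,50426122-50426272,"
    "50425739-50426041"
)
scr = parse_segments("50230099-50230168,50230264-50230484")

print("SRK", total_length(srk), listing_order(srk))  # SRK 2424 descending
print("SCR", total_length(scr), listing_order(scr))  # SCR 291 ascending
```

Under the inclusive-coordinate assumption both totals are multiples of three (2424 = 808 × 3 and 291 = 97 × 3), which is consistent with the segments being coding exons. The descending order of the SRK segments may indicate a minus-strand gene, but the table gives no strand column, so that reading is only a guess.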