Brassica rapa Chiifu_V4.0 Assembly & Annotation

Overview

Analysis Name: Brassica rapa Chiifu_V4.0 Assembly & Annotation
Sequencing technology: ONT, Hi-C
Assembly method: NextDenovo v2.5
Release Date: 2023-06-27
Reference Publication(s)

Zhang L, Liang J, Chen H, Zhang Z, Wu J, Wang X. A near-complete genome assembly of Brassica rapa provides new insights into the evolution of centromeres. Plant Biotechnol J. 2023 May;21(5):1022-1032. doi: 10.1111/pbi.14015.

Summary

Brassica rapa comprises many important cultivated vegetables and oil crops. However, Chiifu v3.0, the current B. rapa reference genome, still contains hundreds of gaps. Here, we presented a near-complete genome assembly of B. rapa Chiifu v4.0, which was 424.59 Mb with only two gaps, using Oxford Nanopore Technology (ONT) ultralong-read sequencing and Hi-C technologies. The new assembly contains 12 contigs, with a contig N50 of 38.26 Mb. Eight of the ten chromosomes were entirely reconstructed in a single contig from telomere to telomere. We found that the centromeres were mainly invaded by ALE and CRM long terminal repeats (LTRs). Moreover, there is a high divergence of centromere length and sequence among B. rapa genomes. We further found that centromeres are enriched for Copia invaded at 0.14 MYA on average, while pericentromeres are enriched for Gypsy LTRs invaded at 0.51 MYA on average. These results indicated the different invasion mechanisms of LTRs between the two structures. In addition, a novel repetitive sequence PCR630 was identified in the pericentromeres of B. rapa. Overall, the near-complete genome assembly, B. rapa Chiifu v4.0, offers valuable tools for genomic and genetic studies of Brassica species and provides new insights into the evolution of centromeres.

Assembly statistics

Estimated genome size (Mb): 455
Assembly size (Mb): 424.59
Contig number: 12
Contig N50 (kb): 38,257
Gap-free chromosome number: 8
Gaps number: 2
Completeness (% BUSCO): 99.40
LTR assembly index score: 15.05
Assembly level: Near-complete
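
The contig count, contig N50, and gap count above can be recomputed from the chromosome FASTA file. The following is a minimal sketch, assuming that gaps are represented as runs of N in the sequences and that the FASTA file from the Downloads section has been saved locally; the function names are illustrative and are not part of any published pipeline.

```python
import gzip
import re


def read_fasta(path):
    """Yield one sequence string per record from a gzip-compressed FASTA file."""
    chunks = []
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith(">"):
                if chunks:
                    yield "".join(chunks)
                chunks = []
            else:
                chunks.append(line.strip())
    if chunks:
        yield "".join(chunks)


def contig_lengths(sequences):
    """Split each sequence at runs of N (assumed gap marker) and return contig lengths."""
    lengths = []
    for seq in sequences:
        for piece in re.split(r"[Nn]+", seq):
            if piece:
                lengths.append(len(piece))
    return lengths


def n50(lengths):
    """Largest length L such that contigs of length >= L cover at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


def count_gaps(sequences):
    """Count internal runs of N; leading and trailing N runs are ignored."""
    return sum(len(re.findall(r"[Nn]+", seq.strip("Nn"))) for seq in sequences)


# Toy input: two "chromosomes", the first containing one gap.
toy = ["ACGTACGTNNNNACGT", "ACGTAC"]
print(contig_lengths(toy))       # [8, 4, 6]
print(n50(contig_lengths(toy)))  # 6
print(count_gaps(toy))           # 1
```

For the real assembly, `list(read_fasta("Brapa_chiifu_v41_genome20230413.fasta.gz"))` could be passed to the same functions. Whether the result matches the reported values depends on how gaps are encoded in the released file.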

Assembly

The Brassica rapa Chiifu_V4.0 Assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file): Brapa_chiifu_v41_genome20230413.fasta.gz

Gene Predictions

The Brassica rapa Chiifu_V4.0 genome gene prediction files are available in GFF3 and FASTA format.

Downloads

Genes (GFF3 file): Brapa_chiifu_v41_gene20230413.gff3.gz
CDS sequences (FASTA file): Brapa_chiifu_v41_gene20230413.gff3.cds.fa.gz
Protein sequences (FASTA file): Brapa_chiifu_v41_gene20230413.gff3.pep.fa.gz
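
As a quick sanity check on the GFF3 download, the number of annotated features per chromosome can be tallied. This sketch relies only on the standard nine-column, tab-separated GFF3 layout. The assumption that the file uses the feature type `gene`, and the sequence and gene IDs in the toy input, are illustrative.

```python
import gzip
from collections import Counter


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path)


def count_features(lines, feature_type="gene"):
    """Count features of one type per sequence ID (column 1) in GFF3 lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments, and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a well-formed GFF3 feature line
        if cols[2] == feature_type:
            counts[cols[0]] += 1
    return counts


# Toy GFF3 content with hypothetical IDs.
toy = [
    "##gff-version 3\n",
    "A01\tsrc\tgene\t100\t900\t.\t+\t.\tID=geneA\n",
    "A01\tsrc\tmRNA\t100\t900\t.\t+\t.\tID=geneA.1;Parent=geneA\n",
    "A01\tsrc\tgene\t1500\t2400\t.\t-\t.\tID=geneB\n",
    "A02\tsrc\tgene\t50\t700\t.\t+\t.\tID=geneC\n",
]
print(dict(count_features(toy)))  # {'A01': 2, 'A02': 1}
```

With the downloaded file, the same function could be called as `count_features(open_text("Brapa_chiifu_v41_gene20230413.gff3.gz"))`.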

Functional Analysis

Functional annotation for Brassica rapa Chiifu_V4.0 is available for download below. The proteins were analyzed with InterProScan to assign InterPro domains (Pfam).

Downloads

Domain from InterProScan: Brassica_rapa.Pfam.tsv.gz
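
The Pfam assignments can be grouped by protein for downstream lookups. The sketch below assumes the file follows the standard InterProScan TSV layout (column 1 protein accession, column 4 analysis name, column 5 signature accession, columns 7 and 8 domain start and stop); this has not been checked against the file itself, and the protein IDs in the toy rows are hypothetical.

```python
from collections import defaultdict


def pfam_domains_by_protein(lines):
    """Map protein ID -> list of (Pfam accession, start, stop) tuples.

    Assumes standard InterProScan TSV columns; rows from other analyses
    or with too few columns are skipped.
    """
    domains = defaultdict(list)
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8 or cols[3] != "Pfam":
            continue
        domains[cols[0]].append((cols[4], int(cols[6]), int(cols[7])))
    return dict(domains)


# Toy rows with hypothetical protein IDs and placeholder checksums.
toy = [
    "protX\tmd5\t815\tPfam\tPF00069\tProtein kinase domain\t500\t770\t1e-40\tT\t01-01-2023\n",
    "protX\tmd5\t815\tOther\tSIG1\tnot a Pfam row\t1\t20\t-\tT\t01-01-2023\n",
    "protY\tmd5\t300\tPfam\tPF00069\tProtein kinase domain\t10\t280\t1e-35\tT\t01-01-2023\n",
]
for protein, hits in pfam_domains_by_protein(toy).items():
    print(protein, hits)
```

If the released file uses a different column order, only the column indices in the function need to change.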

S genes

Summary

Query: SRK
Chromosome: A07
Size (bp): 38320614
Coordinates: 32624729-32626049,32626680-32626826,32626924-32627105,32631508-32631722,32631821-32632034,32632142-32632211,32632400-32632695
BLASTp Hit: sp|Q09092|SRK6_BRAOV
BLASTp %ID: 65.69

Query: SCR
Chromosome: A07
Size (bp): 38320614
Coordinates: 32618152-32618068,32617971-32617778
BLASTp Hit: BAD29945.1
BLASTp %ID: 73.6
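
The coordinate strings for SRK and SCR can be turned into segment lists to derive the strand and the total annotated length. This is a minimal sketch, assuming the coordinates are 1-based and inclusive and that a segment written with descending positions (as for SCR) lies on the minus strand. Both are interpretations of the listing, not statements from the source.

```python
def parse_segments(coords):
    """Parse 'a-b,c-d,...' into a list of (first, second) integer pairs."""
    segments = []
    for part in coords.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate trailing commas
        first, second = (int(x) for x in part.split("-"))
        segments.append((first, second))
    return segments


def strand(segments):
    """Infer strand from the direction of the first segment (assumed convention)."""
    first, second = segments[0]
    return "+" if first <= second else "-"


def total_length(segments):
    """Sum of segment lengths, treating coordinates as 1-based inclusive."""
    return sum(abs(b - a) + 1 for a, b in segments)


SRK = ("32624729-32626049,32626680-32626826,32626924-32627105,"
       "32631508-32631722,32631821-32632034,32632142-32632211,"
       "32632400-32632695")
SCR = "32618152-32618068,32617971-32617778"

for name, coords in (("SRK", SRK), ("SCR", SCR)):
    segs = parse_segments(coords)
    print(name, len(segs), strand(segs), total_length(segs))
# SRK 7 + 2445
# SCR 2 - 279
```

Under these assumptions both totals are multiples of three (815 and 93 codons), which is consistent with the segments being coding exons.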


© 2023 National Genomics Data Center, China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences