Prunus dulcis Texas Genome v2.0 Assembly & Annotation

Overview

Analysis Name: Prunus dulcis Texas Genome v2.0 Assembly & Annotation
Sequencing technology: Illumina PE library and Oxford Nanopore reads
Assembly method: MaSuRCA (v3.2.3)
Release Date: 2018-10-11
Reference Publication(s)

Alioto T, Alexiou KG, Bardil A, Barteri F, Castanera R, Cruz F, Dhingra A, Duval H, Fernández I Martí Á, Frias L, Galán B, García JL, Howad W, Gómez-Garrido J, Gut M, Julca I, Morata J, Puigdomènech P, Ribeca P, Rubio Cabetas MJ, Vlasova A, Wirthensohn M, Garcia-Mas J, Gabaldón T, Casacuberta JM, Arús P. Transposons played a major role in the diversification between the closely related almond and peach genomes: results from the almond genome sequence. Plant J. 2020 Jan;101(2):455-472. doi: 10.1111/tpj.14538.

Summary

We sequenced the genome of the highly heterozygous almond Prunus dulcis cv. Texas combining short- and long-read sequencing. We obtained a genome assembly totaling 227.6 Mb of the estimated almond genome size of 238 Mb, of which 91% is anchored to eight pseudomolecules corresponding to its haploid chromosome complement, and annotated 27,969 protein-coding genes and 6,747 non-coding transcripts. By phylogenomic comparison with the genomes of 16 additional close and distant species, we estimated that almond and peach (Prunus persica) diverged around 5.88 million years ago. These two genomes are highly syntenic and show a high degree of sequence conservation (20 nucleotide substitutions per kb). However, they also exhibit a high number of presence/absence variants, many attributable to the movement of transposable elements (TEs). Transposable elements have generated an important number of presence/absence variants between almond and peach, and we show that the recent history of TE movement seems markedly different between them. Transposable elements may also be at the origin of important phenotypic differences between the two species, in particular the sweet kernel phenotype, a key agronomic and domestication character for almond. Here we show that in sweet almond cultivars, highly methylated TE insertions surround a gene involved in the biosynthesis of amygdalin, whose reduced expression has been correlated with the sweet almond phenotype. Altogether, our results suggest a key role of TEs in the recent history and diversification of almond and its close relative peach.

Assembly statistics

Genome size: 227.6 Mb
Total ungapped length: 223.7 Mb
Number of chromosomes: 8
Number of organelles: 1
Number of scaffolds: 691
Scaffold N50: 24.4 Mb
Scaffold L50: 4
Number of contigs: 4,395
Contig N50: 115.2 kb
Contig L50: 511
GC percent: 37.5
Genome coverage: 800.0x
Assembly level: Chromosome
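
The N50 and L50 values above follow the usual definitions: sort the sequences from longest to shortest and accumulate lengths until at least half of the total is covered. N50 is the length of the sequence at which that happens, and L50 is how many sequences were needed. The following is a minimal Python sketch of that calculation; the function name and the toy lengths are illustrative assumptions and are not taken from the almond assembly.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # The first sequence that brings the running total to at least
        # half of the assembly defines N50; its rank is L50.
        if running >= half:
            return length, count
    raise ValueError("lengths must be non-empty")


# Toy lengths for illustration only (hypothetical, not almond data).
example = [80, 70, 50, 40, 30, 20]
print(n50_l50(example))  # (70, 2)
```

Applied to scaffold lengths, a result such as an L50 of 4 would mean that the four longest scaffolds together cover at least half of the assembly.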

Assembly

The Prunus dulcis Texas Genome v2.0 Assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file): pdulcis26.chromosomes.fasta.gz
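
As a quick sanity check after downloading, the per-sequence lengths and GC content of a gzipped FASTA file can be computed with the Python standard library alone. This is a minimal sketch: the function name is hypothetical, it assumes the file has been saved locally, and it computes GC over A/C/G/T bases only, which may differ from how the GC percent in the statistics above was derived.

```python
import gzip


def fasta_stats(path):
    """Return {sequence_id: (length, gc_fraction)} for a gzipped FASTA file.

    gc_fraction is computed over A/C/G/T bases only, so runs of N
    (gaps) count toward the length but not toward the GC denominator.
    """
    stats = {}
    name, length, acgt, gc = None, 0, 0, 0

    def flush():
        # Store the record collected so far, if any.
        if name is not None:
            stats[name] = (length, gc / acgt if acgt else 0.0)

    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                flush()
                # Use the first whitespace-delimited token as the ID.
                tokens = line[1:].split()
                name = tokens[0] if tokens else ""
                length, acgt, gc = 0, 0, 0
            elif line:
                seq = line.upper()
                g_and_c = seq.count("G") + seq.count("C")
                length += len(seq)
                gc += g_and_c
                acgt += g_and_c + seq.count("A") + seq.count("T")
    flush()
    return stats
```

Calling fasta_stats("pdulcis26.chromosomes.fasta.gz") on a local copy of the download would return one entry per sequence in the file.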

Gene Predictions

The Prunus dulcis Texas Genome v2.0 gene prediction files are available in GFF3 and FASTA formats.

Downloads

Genes (GFF3 file): Prudul26A.chromosomes.gff3.gz
CDS sequences (FASTA file): Prudul26A.cds.fa.gz
Protein sequences (FASTA file): Prudul26A.pep.fa.gz
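
GFF3 is a nine-column, tab-separated format in which the third column holds the feature type (for example gene, mRNA, or CDS). The sketch below tallies feature types in a gzipped GFF3 file, which is one way to compare a downloaded annotation against the gene counts reported in the Summary. The function name is hypothetical, and which feature types actually appear in this file is not stated on this page.

```python
import gzip
from collections import Counter


def count_feature_types(path):
    """Count GFF3 feature types (column 3) in a gzipped GFF3 file."""
    counts = Counter()
    with gzip.open(path, "rt") as handle:
        for line in handle:
            # An optional ##FASTA directive ends the annotation section.
            if line.startswith("##FASTA"):
                break
            # Skip comments, directives, and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            # Well-formed GFF3 feature lines have exactly nine columns.
            if len(fields) == 9:
                counts[fields[2]] += 1
    return counts
```

For example, count_feature_types("Prudul26A.chromosomes.gff3.gz")["gene"] would give the number of lines whose type is gene in a local copy of the file.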

Functional Analysis

Functional annotation for the Prunus dulcis Texas Genome v2.0 is available for download below. The proteins were analyzed using InterProScan to assign InterPro domains (Pfam).

Downloads

Domains from InterProScan: Prunus_dulcis_Texas_Genome_v2.0.Pfam.tsv.gz
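
The column layout of this TSV file is not described on this page. The sketch below assumes the standard InterProScan TSV layout, in which the first column is the protein accession and the fifth column is the signature accession (for Pfam, an ID such as PF00069); if the file differs, the column indices can be changed through the parameters. The function name is hypothetical.

```python
import csv
import gzip
from collections import defaultdict


def domains_by_protein(path, protein_col=0, domain_col=4):
    """Map each protein ID to the set of domain accessions assigned to it.

    Default column indices are an ASSUMPTION based on the standard
    InterProScan TSV layout; verify them against the downloaded file.
    """
    domains = defaultdict(set)
    needed = max(protein_col, domain_col)
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            # Ignore blank or short rows that lack the requested columns.
            if len(row) > needed:
                domains[row[protein_col]].add(row[domain_col])
    return domains
```

Inspecting the first few lines of the file before running this is advisable, since a header row or a different column order would change which indices are correct.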

S genes

Prunus S genes Nucleotide

Prunus S genes Protein

© 2023 National Genomics Data Center, China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences