Malus fusca Genome v1.0 Assembly & Annotation

Overview

Analysis Name: Malus fusca Genome v1.0 Assembly & Annotation
Sequencing technology: PacBio RSII
Assembly method: HiFiasm v. 0.16.1
Release Date: 2024-01-12
Reference Publication(s):

Mansfeld BN, Yocca A, Ou S, Harkess A, Burchard E, Gutierrez B, van Nocker S, Gottschalk C. A haplotype resolved chromosome-scale assembly of North American wild apple Malus fusca and comparative genomics of the fire blight Mfu10 locus. Plant J. 2023 Nov;116(4):989-1002. doi: 10.1111/tpj.16433.

SUMMARY

The Pacific crabapple (Malus fusca) is a wild relative of the commercial apple (Malus × domestica). With a range extending from Alaska to Northern California, M. fusca is extremely hardy and disease resistant. The species represents an untapped genetic resource for the development of new apple cultivars with enhanced stress resistance. However, gene discovery and utilization of M. fusca have been hampered by the lack of genomic resources. Here, we present a high-quality, haplotype-resolved, chromosome-scale genome assembly and annotation for M. fusca. The genome was assembled using high-fidelity long-reads and scaffolded using genetic maps and high-throughput chromatin conformation capture sequencing, resulting in one of the most contiguous apple genomes to date. We annotated the genome using public transcriptomic data from the same species taken from diverse plant structures and developmental stages. Using this assembly, we explored haplotypic structural variation within the genome of M. fusca, identifying thousands of large variants. We further showed high sequence co-linearity with other domesticated and wild Malus species. Finally, we resolve a known quantitative trait locus associated with resistance to fire blight (Erwinia amylovora). Insights gained from the assembly of a reference-quality genome of this hardy wild apple relative will be invaluable as a tool to facilitate DNA-informed introgression breeding.

Assembly statistics

Assembly

The Malus fusca Genome v1.0 assembly is available in FASTA format, with one soft-masked file per haplotype.

Downloads

Chromosomes (FASTA file): Mfusca_v1.0_hap1.soft.masked.fa.gz, Mfusca_v1.0_hap2.soft.masked.fa.gz
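Because the chromosome files are soft-masked, repeat regions are conventionally written in lowercase. The following is a minimal sketch, not part of the original release, of how the length and soft-masked base count of each sequence could be tallied with the Python standard library. The function names are illustrative, and the local file path in the usage comment is an assumption about where the download was saved.

```python
import gzip


def masked_stats(lines):
    """Return {sequence_id: (length, soft_masked_bases)} from FASTA lines.

    Lowercase letters are counted as soft-masked bases.
    """
    stats = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Keep only the first word of the header as the sequence ID.
            name = line[1:].split()[0]
            stats[name] = [0, 0]
        elif name is not None:
            stats[name][0] += len(line)
            stats[name][1] += sum(1 for base in line if base.islower())
    return {seq_id: tuple(values) for seq_id, values in stats.items()}


def masked_stats_from_gz(path):
    """Read a gzip-compressed FASTA file and return its masking statistics."""
    with gzip.open(path, "rt") as handle:
        return masked_stats(handle)


# Usage (assumes the file was downloaded to the working directory):
# stats = masked_stats_from_gz("Mfusca_v1.0_hap1.soft.masked.fa.gz")
# for seq_id, (length, masked) in stats.items():
#     print(seq_id, length, masked / length)
```

Reading line by line keeps memory use low, which matters for chromosome-scale sequences.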

Gene Predictions

The Malus fusca Genome v1.0 gene prediction files are available in GFF3 and FASTA format.

Downloads

Genes (GFF3 file): Mfusca_v1.0_hap1.gff.gz, Mfusca_v1.0_hap2.gff.gz
CDS sequences (FASTA file): Mfusca_v1.0_hap1.CDS.fa.gz, Mfusca_v1.0_hap2.CDS.fa.gz
Protein sequences (FASTA file): Mfusca_v1.0_hap1.proteins.fa.gz, Mfusca_v1.0_hap2.proteins.fa.gz
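GFF3 is a nine-column, tab-separated format whose third column holds the feature type. As a quick sanity check on a downloaded annotation, the sketch below counts features by type. It is an illustration only: the function names are hypothetical, and which feature types actually occur in these files (for example gene, mRNA, CDS) has not been verified here.

```python
import gzip
from collections import Counter


def count_feature_types(lines):
    """Count GFF3 features by type (column 3), skipping comments and blanks."""
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            # Not a well-formed GFF3 feature line; ignore it.
            continue
        counts[fields[2]] += 1
    return counts


def count_feature_types_from_gz(path):
    """Open a gzip-compressed GFF3 file and count its feature types."""
    with gzip.open(path, "rt") as handle:
        return count_feature_types(handle)


# Usage (assumes the file was downloaded to the working directory):
# print(count_feature_types_from_gz("Mfusca_v1.0_hap1.gff.gz"))
```

Comparing the gene count from the GFF3 file with the number of records in the matching protein FASTA file is one simple way to check that the downloads belong together.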

Functional Analysis

Functional annotation for the Malus fusca Genome v1.0 is available for download below. The proteins were analyzed using InterProScan to assign InterPro domains (Pfam).

Downloads

Domains from InterProScan: Mfusca_v1.0_hap1.Pfam.tsv.gz, Mfusca_v1.0_hap2.Pfam.tsv.gz
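The sketch below shows one way the domain assignments could be grouped by protein. It assumes the files follow the standard InterProScan TSV layout (column 1 protein accession, column 4 analysis name, column 5 signature accession); that layout has not been confirmed for these particular files, and the function names are illustrative.

```python
import gzip
from collections import defaultdict


def pfam_domains_by_protein(lines):
    """Map each protein accession to its set of Pfam signature accessions.

    Assumes standard InterProScan TSV columns:
    0 = protein accession, 3 = analysis, 4 = signature accession.
    """
    domains = defaultdict(set)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8 or fields[3] != "Pfam":
            continue
        domains[fields[0]].add(fields[4])
    return dict(domains)


def pfam_domains_from_gz(path):
    """Open a gzip-compressed InterProScan TSV file and group its Pfam hits."""
    with gzip.open(path, "rt") as handle:
        return pfam_domains_by_protein(handle)


# Usage (assumes the file was downloaded to the working directory):
# domains = pfam_domains_from_gz("Mfusca_v1.0_hap1.Pfam.tsv.gz")
# f_box_proteins = [p for p, accs in domains.items() if "PF00646" in accs]
```

PF00646 is the Pfam accession for the F-box domain, so a filter like the one in the usage comment could serve as a starting point for locating F-box candidates.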

S genes

Summary

Query | Chr | Size (bp) | Coordinates | BLASTn Hit | BLASTn %ID | Domain
SFBB.XVI | Chr17 | 35982444 | 31946765-31947952 | MdSFBB.XVI-S9 | 99.24 | F-box; F_box_assoc
SFBB.XVII | Chr17 | 35982444 | 31982165-31980960 | MdSFBB.XVII-S9 | 98.92 | F-box; F_box_assoc
SFBB.XIV | Chr17 | 35982444 | 31984434-31985639 | MdSFBB.XIV-S9 | 99.25 | F-box; F_box_assoc
SFBB.Ib | Chr17 | 35982444 | 32038832-32037627 | MdSFBB.Ib-S9 | 99.25 | F-box; F_box_assoc
SFBB.VI | Chr17 | 35982444 | 32072753-32071575 | MdSFBB.VI-S9 | 97.54 | F-box; F_box_assoc
SFBB.III | Chr17 | 35982444 | 32154207-32155388 | MdSFBB.III-S9 | 98.48 | F-box; F_box_assoc
SFBB.II | Chr17 | 35982444 | 32204855-32206048 | MdSFBB.II-S9 | 96.7 | F-box; F_box_assoc
SFBB.IV | Chr17 | 35982444 | 32210333-32211517 | MdSFBB.IV-S9 | 97.63 | F-box; F_box_assoc
SFBB.XI | Chr17 | 35982444 | 32272707-32271522 | MdSFBB.XI-S9 | 95.87 | F-box; F_box_assoc
SFBB.Ia | Chr17 | 35982444 | 32391453-32392655 | MdSFBB.Ib-S9 | 94.93 | F-box; F_box_assoc
SFBB.Ic | Chr17 | 35982444 | 32507823-32506621 | MdSFBB.Ia-S9 | 90.61 | F-box; F_box_assoc
SFBB.XII | Chr17 | 35982444 | 32663508-32664686 | MdSFBB.XII-S9 | 90.09 | F-box; F_box_assoc
SFBB.XIb | Chr17 | 35982444 | 32771736-32770561 | MdSFBB.XI-S9 | 91.45 | F-box; F_box_assoc
SFBB.V | Chr17 | 35982444 | 32922201-32921023 | MdSFBB.V-S9 | 97.37 | F-box; F_box_assoc
SFBB.VII | Chr17 | 35982444 | 33000342-32999164 | MdSFBB.VII-S9 | 95.42 | F-box; F_box_assoc
SFBB.XIX | Chr17 | 35982444 | 33069853-33068621 | PbrSFBB.XIX-S17 | 97.57 | F-box; F_box_assoc
SFBB.XVIII | Chr17 | 35982444 | 33090986-33089802 | MdSFBB.XVIII-S9 | 95.61 | F-box; F_box_assoc
SFBB.VIII | Chr17 | 35982444 | 33151217-33150027 | MdSFBB.VIII.1-S9 | 97.73 | F-box; F_box_assoc
SFBB.XXI | Chr17 | 35982444 | 33974141-33975490 | MdSFBB.XXI-S9 | 98.74 | F-box; F_box_assoc
S-RNaseψ | Chr17 | 35982444 | 32679322-32679077,32678989-32678462 | MG598497.1, S11-RNase | 99.05 | -
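The Coordinates values can be turned into intervals, strands, and lengths with a few lines of code. The sketch below rests on two assumptions that the page does not state: coordinates are 1-based and inclusive, and a start greater than the end marks a feature on the minus strand. The function names are illustrative.

```python
def parse_coordinates(text):
    """Parse 'start-end[,start-end...]' into a list of (low, high, strand).

    Assumption: start > end indicates the minus strand.
    """
    segments = []
    for part in text.split(","):
        start, end = (int(value) for value in part.strip().split("-"))
        strand = "+" if start <= end else "-"
        segments.append((min(start, end), max(start, end), strand))
    return segments


def total_length(segments):
    """Sum segment lengths, treating coordinates as 1-based and inclusive."""
    return sum(high - low + 1 for low, high, _ in segments)


# SFBB.XVI: a single forward-strand segment.
sfbb_xvi = parse_coordinates("31946765-31947952")
print(sfbb_xvi, total_length(sfbb_xvi))

# S-RNase pseudogene: two segments, both written with start > end.
s_rnase = parse_coordinates("32679322-32679077,32678989-32678462")
print(s_rnase, total_length(s_rnase))
```

Under these assumptions SFBB.XVI spans 1188 bp on the plus strand, and the two S-RNaseψ segments total 774 bp on the minus strand.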

Malus fusca Genome_v1.0 S genes Nucleotide

Malus fusca Genome_v1.0 S genes Protein

© 2023 National Genomics Data Center, China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences