Solanum cajamarquense PG6242 Assembly & Annotation

Overview

Analysis Name: Solanum cajamarquense PG6242 Assembly & Annotation
Sequencing technology: PacBio data and Hi-C data
Assembly method: hifiasm (v.0.13)
Release Date: 2022-06-08
Reference Publication(s)

Tang D, Jia Y, Zhang J, Li H, Cheng L, Wang P, Bao Z, Liu Z, Feng S, Zhu X, Li D, Zhu G, Wang H, Zhou Y, Zhou Y, Bryan GJ, Buell CR, Zhang C, Huang S. Genome evolution and diversity of wild and cultivated potatoes. Nature. 2022 Jun;606(7914):535-541. doi: 10.1038/s41586-022-04822-x.

Abstract

Potato (Solanum tuberosum L.) is the world’s most important non-cereal food crop, and the vast majority of commercially grown cultivars are highly heterozygous tetraploids. Advances in diploid hybrid breeding based on true seeds have the potential to revolutionize future potato breeding and production. So far, relatively few studies have examined the genome evolution and diversity of wild and cultivated landrace potatoes, which limits the application of their diversity in potato breeding. Here we assemble 44 high-quality diploid potato genomes from 24 wild and 20 cultivated accessions that are representative of Solanum section Petota, the tuber-bearing clade, as well as 2 genomes from the neighbouring section, Etuberosum. Extensive discordance of phylogenomic relationships suggests the complexity of potato evolution. We find that the potato genome substantially expanded its repertoire of disease-resistance genes when compared with closely related seed-propagated solanaceous crops, indicative of the effect of tuber-based propagation strategies on the evolution of the potato genome. We discover a transcription factor that determines tuber identity and interacts with the mobile tuberization inductive signal SP6A. We also identify 561,433 high-confidence structural variants and construct a map of large inversions, which provides insights for improving inbred lines and precluding potential linkage drag, as exemplified by a 5.8-Mb inversion that is associated with carotenoid content in tubers. This study will accelerate hybrid potato breeding and enrich our understanding of the evolution and biology of potato as a global staple food crop.

Assembly statistics

Contig total length: 1,498,428,098 bp
Contig number: 4,024
Contig N50: 3,719,053 bp
Contig L50: 69
Contig longest: 39,335,337 bp
Assembly level: Contig
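
For readers unfamiliar with the N50 and L50 statistics reported above, the following is a minimal sketch of how they are conventionally computed from a list of contig lengths. The function name `n50_l50` and the toy lengths are illustrative assumptions; they are not taken from the PG6242 pipeline or data.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig lengths in bp.

    N50 is the length of the contig at which the running total of
    lengths, sorted longest first, reaches half of the assembly size.
    L50 is the number of contigs needed to reach that point.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count


# Toy lengths for illustration only (not PG6242 contigs).
toy_lengths = [50, 40, 30, 20, 10]
print(n50_l50(toy_lengths))
```

Read this way, a contig L50 of 69 means that the 69 longest contigs together cover at least half of the total contig length.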

Assembly

The Solanum cajamarquense PG6242 Assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file): PG6242.fa.gz
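
As a rough sketch of how the downloaded assembly might be inspected, the snippet below reads a gzip-compressed FASTA file with the Python standard library and records the length of each sequence. The function name `contig_lengths` is hypothetical, and the snippet assumes a conventional FASTA layout (header lines starting with `>`, sequence on the following lines).

```python
import gzip


def contig_lengths(path):
    """Map each sequence name in a gzipped FASTA file to its length in bp."""
    lengths = {}
    name = None
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Use the first whitespace-delimited token as the name.
                parts = line[1:].split()
                name = parts[0] if parts else ""
                lengths[name] = 0
            elif name is not None:
                lengths[name] += len(line)
    return lengths
```

Assuming the file has been saved locally, `contig_lengths("PG6242.fa.gz")` would return a dictionary whose values could be summed or sorted to cross-check the assembly statistics.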

Gene Predictions

The Solanum cajamarquense PG6242 genome gene prediction files are available in GFF3 and FASTA format.

Downloads

Genes (GFF3 file): PG6242.gff.gz
CDS sequences (FASTA file): PG6242.cds.fa.gz
Protein sequences (FASTA file): PG6242.protein.fa.gz

Functional Analysis

Functional annotation for the Solanum cajamarquense PG6242 genome is available for download below. The predicted proteins were analyzed with InterProScan to assign InterPro domains (Pfam).

Downloads

Domain from InterProScan: Solanum_cajamarquense_PG6242.Pfam.tsv.gz

S genes

Summary

Query | Contig | Size (bp) | Coordinates | BLASTn Hit | BLASTn %ID | Domain
SLF10Ψ | atg0218 | 376542 | 29785-28589 | Solanum chilense KJ814888.1, SLF10 | 91.1 | -
SLF9Ψ | atg0608 | 547624 | 67974-66833 | Solanum tuberosum DM8.1, SLF9 | 97.6 | -
SLF21 | atg0640 | 1013887 | 745623-746843 | Solanum tuberosum DM8.1, SLF21 | 96.5 | F-box domain
SLF5 | atg0674 | 512392 | 244035-242845 | Solanum tuberosum DM8.1, SLF5 | 96.1 | F-box domain
SLF12 | atg0674 | 512392 | 273776-274942 | Solanum tuberosum DM8.1, SLF12 | 98.6 | F-box domain
SLF4 | atg0674 | 512392 | 495434-494268 | Solanum pimpinellifolium KJ814871.1, SLF4 | 96.4 | F-box domain
SLF19 | atg0818 | 764564 | 91815-92930 | Solanum tuberosum DM8.1, SLF19 | 96 | F-box domain
SLF18 | atg0818 | 764564 | 117714-116599 | Solanum tuberosum DM8.1, SLF18 | 96.3 | F-box domain
SLF6Ψ | atg1051 | 719671 | 64899-63758 | Solanum tuberosum DM8.1, SLF6-2 | 94.8 | -
SLF13 | atg1051 | 719671 | 371969-370767 | Solanum tuberosum DM8.1, SLF13 | 98 | F-box domain
SLF6-2 | atg2585 | 25202 | 20346-19204 | Solanum tuberosum DM8.1, SLF6 | 93.2 | F-box domain
SLF6-3 | hptg0038 | 1351273 | 535466-534324 | Solanum tuberosum DM8.1, SLF6 | 93.2 | F-box domain
SLF5-2Ψ | hptg0077 | 162479 | 55784-54613 | Solanum tuberosum DM8.1, SLF5-2 | 97.5 | -
SLF7Ψ | hptg0077 | 162479 | 154995-153833 | Solanum tuberosum DM8.1, SLF7 | 95.6 | -
S-RNase | hptg0110 | 2747417 | 433886-434119, 434206-434619 | Solanum tuberosum MZ561415.1, SRNase-S12 | 95.1 | Ribonuclease T2 family
SLF17 | hptg0110 | 2747417 | 2317341-2318501 | Solanum tuberosum DM8.1, SLF17 | 95.6 | F-box domain
SLF6-4 | hptg0194 | 1003670 | 508385-507240 | Solanum tuberosum DM8.1, SLF6 | 97 | F-box domain
SLF17-2 | hptg0194 | 1003670 | 801653-800472 | Solanum tuberosum DM8.1, SLF17 | 97.2 | F-box domain
SLF22Ψ | hptg0234 | 1176391 | 211210-210058 | Solanum tuberosum DM8.1, SLF22-2 | 97.7 | -
S-RNase-2 | hptg0234 | 1176391 | 1116073-1115837, 1115749-1115336 | Solanum tuberosum XM_006347185.1, RNase1 | 98.2 | Ribonuclease T2 family
SLF18-2 | ptg0037 | 13723268 | 13580936-13582051 | Solanum tuberosum DM8.1, SLF18 | 96.3 | F-box domain
SLF19-2 | ptg0037 | 13723268 | 13603309-13602194 | Solanum tuberosum DM8.1, SLF19 | 95.6 | F-box domain
SLF15 | ptg0044 | 6186968 | 2676982-2678241 | Solanum tuberosum DM8.1, SLF15 | 97 | F-box domain
SLF16Ψ | ptg0044 | 6186968 | 3952147-3950967 | Solanum tuberosum DM8.1, SLF16 | 98.4 | -
SLF5-3 | ptg0064 | 666358 | 351059-349875 | Solanum tuberosum DM8.1, SLF5 | 97.2 | F-box domain
SLF12-2 | ptg0064 | 666358 | 377246-378412 | Solanum tuberosum DM8.1, SLF12 | 98.5 | F-box domain
SLF5-4Ψ | ptg0065 | 8065906 | 2847-1679 | Solanum tuberosum DM8.1, SLF5-2 | 98 | -
SLF7-2 | ptg0065 | 8065906 | 60676-59513 | Solanum tuberosum DM8.1, SLF7 | 95.5 | F-box domain
SLF20Ψ | ptg0065 | 8065906 | 72300-71134 | Solanum tuberosum DM8.1, SLF20 | 96.2 | -
SLF9-2Ψ | ptg0065 | 8065906 | 1642956-1641816 | Solanum tuberosum DM8.1, SLF9 | 97.5 | -
SLF10-2Ψ | ptg0065 | 8065906 | 2435656-2434458 | Solanum chilense KJ814888.1, SLF10 | 91.1 | -
SLF6-5Ψ | ptg0065 | 8065906 | 4880661-4879520 | Solanum tuberosum DM8.1, SLF6-2 | 95 | -
SLF13-2 | ptg0065 | 8065906 | 5420136-5418934 | Solanum tuberosum DM8.1, SLF13 | 98 | F-box domain
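
The coordinates in the summary are given as start-end pairs, with some entries listing two comma-separated segments and many listing a start larger than the end. The sketch below shows one way such a coordinate string could be turned into a sequence. It rests on two assumptions that this page does not state: that coordinates are 1-based and inclusive, and that a start greater than the end denotes the reverse strand. The function names are hypothetical.

```python
# Translation table for complementing DNA (upper and lower case, N kept).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def parse_coordinates(text):
    """Parse 'a-b' or 'a-b, c-d' into a list of (start, end) integer pairs."""
    pairs = []
    for part in text.split(","):
        start, end = part.strip().split("-")
        pairs.append((int(start), int(end)))
    return pairs


def extract_region(contig_seq, start, end):
    """Return one segment; reverse-complement it when start > end.

    Assumes 1-based inclusive coordinates (an assumption, see above).
    """
    low, high = min(start, end), max(start, end)
    if low < 1 or high > len(contig_seq):
        raise ValueError("coordinates fall outside the contig")
    segment = contig_seq[low - 1:high]
    if start > end:
        segment = segment.translate(_COMPLEMENT)[::-1]
    return segment


def extract_feature(contig_seq, coordinate_text):
    """Concatenate all segments of a feature in the order they are listed."""
    return "".join(
        extract_region(contig_seq, start, end)
        for start, end in parse_coordinates(coordinate_text)
    )


# Toy contig for illustration only (not PG6242 sequence).
print(extract_feature("ACGTAC", "5-2"))
```

For a two-segment entry such as the S-RNase rows, the segments are joined in the order listed, which would correspond to transcript order if the strand assumption holds.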

© 2023 National Genomics Data Center, China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences