Solanum scabrum gwh_assembly PI 643126 01 Assembly & Annotation

Overview

Analysis Name: Solanum scabrum gwh_assembly PI 643126 01 Assembly & Annotation
Sequencing technology: PacBio HiFi
Assembly method: Hifiasm v0.16.1-r375
Release Date: 2023-03-21
Reference Publication(s)

Wu Y, Li D, Hu Y, Li H, Ramstein GP, Zhou S, Zhang X, Bao Z, Zhang Y, Song B, Zhou Y, Zhou Y, Gagnon E, Särkinen T, Knapp S, Zhang C, Städler T, Buckler ES, Huang S. Phylogenomic discovery of deleterious mutations facilitates hybrid potato breeding. Cell. 2023 May 25;186(11):2313-2328.e15. doi: 10.1016/j.cell.2023.04.008.

Summary

Hybrid potato breeding will transform the crop from a clonally propagated tetraploid to a seed-reproducing diploid. Historical accumulation of deleterious mutations in potato genomes has hindered the development of elite inbred lines and hybrids. Utilizing a whole-genome phylogeny of 92 Solanaceae and its sister clade species, we employ an evolutionary strategy to identify deleterious mutations. The deep phylogeny reveals the genome-wide landscape of highly constrained sites, comprising ∼2.4% of the genome. Based on a diploid potato diversity panel, we infer 367,499 deleterious variants, of which 50% occur at non-coding and 15% at synonymous sites. Counterintuitively, diploid lines with relatively high homozygous deleterious burden can be better starting material for inbred-line development, despite showing less vigorous growth. Inclusion of inferred deleterious mutations increases genomic-prediction accuracy for yield by 24.7%. Our study generates insights into the genome-wide incidence and properties of deleterious mutations and their far-reaching consequences for breeding.

Assembly statistics

Genome size (bp): 3,006,187,379
GC content: 36.64%
Contig sequence No.: 5,331
Maximum contig sequence length (bp): 8,620,688
Minimum contig sequence length (bp): 8,433
Average contig sequence length (bp): 563,907
Contig N50 (bp): 1,416,492
Contig N90 (bp): 330,622
Assembly level: Contig
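
For readers who want to recompute the contig N50 and N90 from a list of contig lengths, the following is a minimal sketch of the usual definition: the length L such that contigs of length L or longer together cover at least the given percentage of the total assembly. The function name `nx` and the toy lengths are illustrative assumptions, not part of the record or of any tool used to produce it.

```python
def nx(lengths, percent):
    """Return the Nx value for a list of contig lengths.

    Nx is the length of the contig at which the running total of
    lengths, taken from longest to shortest, first reaches `percent`
    percent of the summed length. Integer arithmetic avoids
    floating-point rounding at the threshold.
    """
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running * 100 >= total * percent:
            return length
    raise ValueError("no contig lengths given")


# Toy lengths (hypothetical, not taken from this assembly); total = 20.
toy_lengths = [8, 6, 4, 2]
print("N50:", nx(toy_lengths, 50))
print("N90:", nx(toy_lengths, 90))
```

Applied to the lengths of all 5,331 contigs in the FASTA file, the same function should yield values comparable to those listed above, assuming the record uses this definition.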

Assembly

The Solanum scabrum gwh_assembly PI 643126 01 genome assembly is available in FASTA format.

Downloads

Chromosomes (FASTA file): GWHBKCC00000000.genome.fasta.gz

Gene Predictions

The Solanum scabrum gwh_assembly PI 643126 01 genome gene prediction files are available in GFF3 and FASTA format.

Downloads

Genes (GFF3 file): GWHBKCC00000000.gff.gz
CDS sequences (FASTA file): GWHBKCC00000000.CDS.fasta.gz
Protein sequences (FASTA file): GWHBKCC00000000.Protein.faa.gz
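
As a quick sanity check after downloading, the feature types in a gzipped GFF3 file can be tallied with the Python standard library. This is a minimal sketch that relies only on the standard nine-column, tab-separated GFF3 layout. The function names and the sample records are illustrative assumptions, and the sample lines are made up rather than taken from GWHBKCC00000000.gff.gz.

```python
import gzip
from collections import Counter


def count_feature_types(lines):
    """Tally column 3 (feature type) over GFF3 record lines."""
    counts = Counter()
    for line in lines:
        # Skip blank lines, directives (##...) and comments (#...).
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A GFF3 record has exactly nine tab-separated columns.
        if len(fields) != 9:
            continue
        counts[fields[2]] += 1
    return counts


def count_feature_types_in_file(path):
    """Same tally, reading a gzipped GFF3 file in text mode."""
    with gzip.open(path, "rt") as handle:
        return count_feature_types(handle)


# Made-up records for illustration only.
sample = [
    "##gff-version 3\n",
    "contig1\texample\tgene\t100\t900\t.\t+\t.\tID=gene1\n",
    "contig1\texample\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1\n",
    "contig1\texample\tCDS\t100\t900\t.\t+\t0\tID=cds1;Parent=mrna1\n",
]
print(count_feature_types(sample))
```

With the file saved locally, `count_feature_types_in_file("GWHBKCC00000000.gff.gz")` would return the same kind of tally for the real annotation. The local path is an assumption.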

Functional Analysis

Functional annotation for the Solanum scabrum gwh_assembly PI 643126 01 assembly is available for download below. The proteins were analyzed with InterProScan to assign InterPro domains (Pfam).

Downloads

Domain from InterProScan: Solanum_scabrum_gwh_PI_643126_01.Pfam.tsv.gz

S genes

Summary

Query     Contig           Size (bp)  Coordinates      BLASTn Hit                              BLASTn %ID Domain
SLF18     GWHBKCC00000122  742059     656892-658010    Solanum tuberosum DM8.1, SLF18          87.9       F-box domain
SLF19     GWHBKCC00000122  742059     699859-698759    Solanum tuberosum DM8.1, SLF19          90.8       F-box domain
SLF21Ψ    GWHBKCC00000125  1696573    1229261-1230474  Solanum tuberosum DM8.1, SLF21          87.0       -
SLF22Ψ    GWHBKCC00000383  2312416    142223-141087    Solanum tuberosum DM8.1, SLF22          86.7       -
SLF11     GWHBKCC00000488  3285942    744293-745462    Solanum tuberosum DM8.1, SLF11          88.9       F-box domain
SLF       GWHBKCC00000490  2906607    2803694-2804851  Solanum chilense KF475783.1, SLF-ZF5-1  87.2       F-box domain
SLF21-2Ψ  GWHBKCC00000800  949349     9441-10654       Solanum tuberosum DM8.1, SLF21          86.8       -
SLF5      GWHBKCC00000976  664347     223554-222385    Solanum tuberosum DM8.1, SLF5-2         87.2       F-box domain
SLF7      GWHBKCC00001039  4482429    3067016-3068185  Solanum tuberosum DM8.1, SLF7           88.2       F-box domain
SLF16Ψ    GWHBKCC00001053  257340     199317-200505    Solanum tuberosum DM8.1, SLF16          87.6       -
SLF19-2   GWHBKCC00001063  551132     27372-28469      Solanum tuberosum DM8.1, SLF19          88.6       F-box domain
SLF19-3   GWHBKCC00001063  551132     72181-73278      Solanum tuberosum DM8.1, SLF19          88.6       F-box domain
SLF19-4   GWHBKCC00001063  551132     116975-118072    Solanum tuberosum DM8.1, SLF19          88.6       F-box domain
SLF19-5   GWHBKCC00001063  551132     157539-158636    Solanum tuberosum DM8.1, SLF19          88.5       F-box domain
SLF18-2   GWHBKCC00001063  551132     204141-203026    Solanum lycopersicum SL2.31, SLF18      88.5       F-box domain
SLF22-2Ψ  GWHBKCC00001340  980973     343348-342212    Solanum tuberosum DM8.1, SLF22          87.3       -
SLF19-6Ψ  GWHBKCC00001432  479621     30966-29887      Solanum tuberosum DM8.1, SLF19          88.6       -
SLF15Ψ    GWHBKCC00001636  1250407    51886-53145      Solanum tuberosum DM8.1, SLF15          90.3       -
SLF16-2Ψ  GWHBKCC00001636  1250407    639089-637906    Solanum tuberosum DM8.1, SLF16          88.4       -
SLF19-7Ψ  GWHBKCC00001712  1025684    54219-55320      Solanum tuberosum DM8.1, SLF19          90.2       -
SLF18-3   GWHBKCC00001712  1025684    101373-100264    Solanum tuberosum DM8.1, SLF18          88.8       F-box domain
SLF11-2   GWHBKCC00002054  1127140    673008-671839    Solanum tuberosum DM8.1, SLF11          89.3       F-box domain
SLF22-3Ψ  GWHBKCC00002084  1268471    1119745-1120880  Solanum tuberosum DM8.1, SLF22          87.1       -
SLF19-8   GWHBKCC00002284  590647     8144-7047        Solanum tuberosum DM8.1, SLF19          88.6       F-box domain
SLF19-9   GWHBKCC00002284  590647     57021-55924      Solanum tuberosum DM8.1, SLF19          88.6       F-box domain
SLF19-10  GWHBKCC00002284  590647     158436-157339    Solanum tuberosum DM8.1, SLF19          88.0       F-box domain
SLF15-2   GWHBKCC00002430  913695     268218-269477    Solanum tuberosum DM8.1, SLF15          90.2       F-box domain
SLF15-3   GWHBKCC00002703  91833      55844-54585      Solanum tuberosum DM8.1, SLF15          90.3       F-box domain
SLF16-3   GWHBKCC00002711  751851     625331-624147    Solanum tuberosum DM8.1, SLF16          88.0       F-box domain
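
Several coordinate pairs in the table are written with the larger value first (for example 699859-698759). The sketch below normalizes such a pair and reports its span. It assumes, without confirmation from the record, that a descending pair marks a hit on the reverse strand and that coordinates are 1-based and inclusive. The function name `parse_span` is illustrative.

```python
def parse_span(coordinates):
    """Split 'first-second' into (start, end, strand, length).

    Assumptions (not stated by the source): first > second means the
    reverse strand, and positions are 1-based and inclusive.
    """
    first, second = (int(part) for part in coordinates.split("-"))
    strand = "+" if first <= second else "-"
    start, end = min(first, second), max(first, second)
    return start, end, strand, end - start + 1


# Coordinate strings copied from the SLF18 and SLF19 rows of the table.
print(parse_span("656892-658010"))
print(parse_span("699859-698759"))
```

Under these assumptions, the two loci on contig GWHBKCC00000122 span 1,119 bp and 1,101 bp and lie on opposite strands.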

© 2023 National Genomics Data Center, China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences