Analysis Name | Solanum lycopersicum gwh Tao_Lin Assembly & Annotation |
Sequencing technology | PacBio |
Assembly method | Canu version 1.3 |
Release Date | 2022-01-06 |
Su X, Wang B, Geng X, Du Y, Yang Q, Liang B, Meng G, Gao Q, Yang W, Zhu Y, Lin T. A high-continuity and annotated tomato reference genome. BMC Genomics. 2021 Dec 15;22(1):898. doi: 10.1186/s12864-021-08212-x.
Abstract
Background: Genetic and functional genomics studies require a high-quality genome assembly. Tomato (Solanum lycopersicum), an important horticultural crop, is an ideal model species for the study of fruit development.
Results: Here, we assembled an updated reference genome of S. lycopersicum cv. Heinz 1706 that was 799.09 Mb in length, containing 34,384 predicted protein-coding genes and 65.66% repetitive sequences. By comparing the genomes of S. lycopersicum and S. pimpinellifolium LA2093, we found a large number of genomic fragments probably associated with human selection, which may have had crucial roles in the domestication of tomato. We also used a recombinant inbred line (RIL) population to generate a high-density genetic map with high resolution and accuracy. Using these resources, we identified a number of candidate genes that were likely to be related to important agronomic traits in tomato.
Conclusion: Our results offer opportunities for understanding the evolution of the tomato genome and will facilitate the study of genetic mechanisms in tomato biology.
Assembly statistics
Genome size (bp) | 799,091,949 |
Chromosomes sequence No. | 12 |
Genome sequence No. | 83 |
Maximum genome sequence length (bp) | 91,866,112 |
Minimum genome sequence length (bp) | 16,018 |
Average genome sequence length (bp) | 9,627,614 |
Genome sequence N50 (bp) | 66,166,780 |
Genome sequence N90 (bp) | 54,674,139 |
Assembly level | Chromosome |
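The N50 and N90 values above follow the usual definition: sort the sequence lengths from longest to shortest, accumulate them, and report the length at which the running total first reaches 50% (or 90%) of the assembly size. The sketch below illustrates that calculation. The function name `nx` and the toy lengths are illustrative assumptions, not the actual 83 sequence lengths of this assembly.

```python
def nx(lengths, percent):
    """Return the Nx value for a list of sequence lengths.

    Nx is the length L such that sequences of length >= L together
    cover at least `percent` % of the total assembly size.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length


# Toy data (hypothetical), total = 300
toy_lengths = [100, 80, 60, 40, 20]
n50 = nx(toy_lengths, 50)  # 100 + 80 = 180 >= 150, so N50 = 80
n90 = nx(toy_lengths, 90)  # 100 + 80 + 60 + 40 = 280 >= 270, so N90 = 40
print(n50, n90)
```

Applied to the real per-sequence lengths, the same function should reproduce the N50 and N90 rows of the table, provided the table was computed with this definition.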
The Solanum lycopersicum gwh Tao_Lin Assembly file is available in FASTA format.
Downloads
Chromosomes (FASTA file) | GWHBAUD00000000.genome.fasta.gz |
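After download, the sequence count and per-sequence lengths reported under Assembly statistics can be checked by streaming the compressed FASTA file. This is a minimal sketch using only the Python standard library; the function name `fasta_lengths` and the toy file are assumptions for illustration, and the demo runs on a tiny generated file rather than the real assembly.

```python
import gzip
import os
import tempfile


def fasta_lengths(path):
    """Yield (sequence_id, length) for each record in a gzip-compressed FASTA file."""
    seq_id, length = None, 0
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if seq_id is not None:
                    yield seq_id, length
                header = line[1:].split()
                seq_id = header[0] if header else ""
                length = 0
            elif seq_id is not None:
                length += len(line)
    if seq_id is not None:
        yield seq_id, length


# Demo on a toy file (hypothetical content), so the sketch runs without the download.
toy_path = os.path.join(tempfile.mkdtemp(), "toy.fasta.gz")
with gzip.open(toy_path, "wt") as out:
    out.write(">seqA toy record\nACGT\nAC\n>seqB\nGGG\n")

toy_lengths = dict(fasta_lengths(toy_path))
print(toy_lengths)  # {'seqA': 6, 'seqB': 3}
```

To check the real assembly, pass the path of the downloaded `GWHBAUD00000000.genome.fasta.gz` instead of the toy file; the number of records and their summed length can then be compared with the table.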
The Solanum lycopersicum gwh Tao_Lin genome gene prediction files are available in GFF3 and FASTA format.
Downloads
Genes (GFF3 file) | GWHBAUD00000000.gff.gz |
CDS sequences (FASTA file) | GWHBAUD00000000.CDS.fasta.gz |
Protein sequences (FASTA file) | GWHBAUD00000000.Protein.faa.gz |
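A quick way to inspect the gene prediction file is to tally feature types from the third column of the GFF3 records. The sketch below assumes the file follows the standard nine-column, tab-separated GFF3 layout; the function name `count_feature_types` and the toy records are illustrative, and the feature type names actually used in `GWHBAUD00000000.gff.gz` have not been verified here.

```python
from collections import Counter


def count_feature_types(lines):
    """Count GFF3 feature types (column 3) over an iterable of text lines."""
    counts = Counter()
    for line in lines:
        # Skip directives, comments, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            continue  # not a feature line (e.g. an embedded FASTA section)
        counts[fields[2]] += 1
    return counts


# Toy records (hypothetical), not taken from the real annotation.
toy_gff = [
    "##gff-version 3\n",
    "chrToy\tdemo\tgene\t1\t100\t.\t+\t.\tID=gene1\n",
    "chrToy\tdemo\tmRNA\t1\t100\t.\t+\t.\tID=mrna1;Parent=gene1\n",
    "chrToy\tdemo\tCDS\t1\t100\t.\t+\t0\tID=cds1;Parent=mrna1\n",
    "chrToy\tdemo\tgene\t200\t300\t.\t-\t.\tID=gene2\n",
]
toy_counts = count_feature_types(toy_gff)
print(toy_counts["gene"], toy_counts["mRNA"], toy_counts["CDS"])  # 2 1 1
```

For the real file, the function can be given an open text handle, for example one returned by `gzip.open(path, "rt")`, since it only needs an iterable of lines.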
Functional annotation for the Solanum lycopersicum gwh Tao_Lin assembly is available for download below. The predicted proteins were analyzed with InterProScan to assign InterPro domains (Pfam).
Downloads
Domain from InterProScan | Solanum_lycopersicum_gwh_Tao_Lin.Pfam.tsv.gz |
Summary
Query | Chr | Chr size (bp) | Coordinates | BLASTn hit | BLASTn %ID | Domain |
SLF15 | GWHBAUD00000001 | 91866112 | 2144101-2142842 | SL2.31ch01:2198500-2196501_SLF15 | 100 | F-box domain |
SLF16 | GWHBAUD00000001 | 91866112 | 2666068-2664887 | SL2.31ch01:2723400-2721301_SLF16 | 100 | F-box domain |
SLF17Ψ | GWHBAUD00000001 | 91866112 | 41862564-41861479 | SL2.31ch01:40853100-40851001_SLF17Ψ | 100 | - |
SLF1 | GWHBAUD00000001 | 91866112 | 44862998-44864167 | NM_001301439.2, SLF1 | 100 | F-box domain |
S-RNase | GWHBAUD00000001 | 91866112 | 45672592-45672353, 45672255-45671830 | XM_004229015.1, Ribonuclease S-3 | 100 | Ribonuclease T2 family |
SLF2Ψ | GWHBAUD00000001 | 91866112 | 46559497-46558316 | KJ814870.1, SLF2 | 100 | - |
SLF12Ψ | GWHBAUD00000001 | 91866112 | 46616065-46617196 | SL2.31ch01:45516501-45518600_SLF12Ψ | 100 | - |
SLF4Ψ | GWHBAUD00000001 | 91866112 | 46683367-46682201 | KJ814943.1, SLF4 | 100 | - |
SLF5Ψ | GWHBAUD00000001 | 91866112 | 46764066-46762898 | KJ814872.1, SLF5 | 100 | - |
SLF6Ψ | GWHBAUD00000001 | 91866112 | 46781602-46780457 | KJ814944.1, SLF6 | 100 | - |
SLF8Ψ | GWHBAUD00000001 | 91866112 | 47338932-47337764 | SL2.31ch01:46243000-46240701_SLF8Ψ | 100 | - |
SLF7Ψ | GWHBAUD00000001 | 91866112 | 47363711-47362614 | SL2.31ch01:46267800-46265701_SLF7Ψ | 100 | - |
SLF9 | GWHBAUD00000001 | 91866112 | 49531820-49530756 | NM_001329461.2, SLF9 | 100 | F-box domain |
SLF10Ψ | GWHBAUD00000001 | 91866112 | 49977014-49978245 | KJ814899.1, SLF10 | 100 | - |
SLF11 | GWHBAUD00000001 | 91866112 | 51928457-51929629 | KJ814877.1, SLF11 | 100 | F-box associated |
SLF12 | GWHBAUD00000001 | 91866112 | 53687508-53686345 | NM_001301441.1, SLF12 | 100 | F-box associated |
SLF13 | GWHBAUD00000001 | 91866112 | 54533010-54531817 | NM_001301435.1, SLF13 | 100 | F-box associated |
SLF14Ψ | GWHBAUD00000001 | 91866112 | 57743756-57742586 | KJ814903.1, SLF14 | 100 | - |
SLF18 | GWHBAUD00000001 | 91866112 | 68901807-68902922 | SL2.31ch01:67739501-67741500_SLF18 | 100 | F-box domain |
SLF19 | GWHBAUD00000001 | 91866112 | 68920831-68919722 | SL2.31ch01:67757501-67759600_SLF19 | 100 | F-box domain |
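In the Coordinates column, the order of the two numbers appears to encode orientation: entries written high-to-low (such as SLF15) lie on the minus strand, and entries written low-to-high (such as SLF1) on the plus strand. The sketch below parses such an entry into strand, segments, and total length. It assumes 1-based inclusive coordinates and that multi-segment entries are separated by a comma or semicolon; both are assumptions, as is the function name `parse_coordinates`. Reading the S-RNase entry as two segments is an interpretation of that row.

```python
import re


def parse_coordinates(text):
    """Parse 'start-end[, start-end ...]' into (strand, segments, total_length).

    Assumes 1-based inclusive coordinates; start > end means minus strand.
    """
    segments = []
    for part in re.split(r"[,;]\s*", text.strip()):
        start_s, end_s = part.split("-")
        segments.append((int(start_s), int(end_s)))
    strands = {"+" if start <= end else "-" for start, end in segments}
    if len(strands) != 1:
        raise ValueError("segments disagree on orientation: " + text)
    total = sum(abs(end - start) + 1 for start, end in segments)
    return strands.pop(), segments, total


slf15 = parse_coordinates("2144101-2142842")
slf1 = parse_coordinates("44862998-44864167")
# Two-segment reading of the S-RNase row (an interpretation, see above).
s_rnase = parse_coordinates("45672592-45672353, 45672255-45671830")

print(slf15[0], slf15[2])      # - 1260
print(slf1[0], slf1[2])        # + 1170
print(s_rnase[0], s_rnase[2])  # - 666
```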