Arabidopsis suecica ASS3 Assembly & Annotation

Overview

Analysis Name Arabidopsis suecica ASS3 Assembly & Annotation
Sequencing technology PacBio RSII, Illumina
Assembly method Falcon, Canu, quickmerge
Release Date 2021-01-28
Reference Publication(s)

Burns R, Mandáková T, Gunis J, Soto-Jiménez LM, Liu C, Lysak MA, Novikova PY, Nordborg M. Gradual evolution of allopolyploidy in Arabidopsis suecica. Nat Ecol Evol. 2021 Oct;5(10):1367-1381. doi: 10.1038/s41559-021-01525-w.

Abstract

Most diploid organisms have polyploid ancestors. The evolutionary process of polyploidization is poorly understood but has frequently been conjectured to involve some form of ‘genome shock’, such as genome reorganization and subgenome expression dominance. Here we study polyploidization in Arabidopsis suecica, a post-glacial allopolyploid species formed via hybridization of Arabidopsis thaliana and Arabidopsis arenosa. We generated a chromosome-level genome assembly of A. suecica and complemented it with polymorphism and transcriptome data from all species. Despite a divergence around 6 million years ago (Ma) between the ancestral species and differences in their genome composition, we see no evidence of a genome shock: the A. suecica genome is colinear with the ancestral genomes; there is no subgenome dominance in expression; and transposon dynamics appear stable. However, we find changes suggesting gradual adaptation to polyploidy. In particular, the A. thaliana subgenome shows upregulation of meiosis-related genes, possibly to prevent aneuploidy and undesirable homeologous exchanges that are observed in synthetic A. suecica, and the A. arenosa subgenome shows upregulation of cyto-nuclear processes, possibly in response to the new cytoplasmic environment of A. suecica, with plastids maternally inherited from A. thaliana. These changes are not seen in synthetic hybrids, and thus are likely to represent subsequent evolution.

Assembly statistics

Genome size 262.6 Mb
Total ungapped length 262.3 Mb
Number of scaffolds 13
Scaffold N50 19.6 Mb
Scaffold L50 6
Number of contigs 291
Contig N50 9 Mb
Contig L50 11
GC percent 36
Genome coverage 50.0x
Assembly level Scaffold
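
For readers unfamiliar with the N50 and L50 figures above, the following is a minimal sketch of how these statistics are conventionally computed from a list of sequence lengths. The function name `n50_l50` and the toy lengths are illustrative assumptions, not part of this record, and the sketch is not the pipeline that produced the values listed here.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total length.
    L50 is the number of sequences in that set.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("lengths must be non-empty")


# Toy example (hypothetical lengths, not the ASS3 scaffolds).
toy_lengths = [10, 8, 6, 4, 2]
n50, l50 = n50_l50(toy_lengths)
print(n50, l50)
```

Applied to the scaffold lengths of an assembly, the same function would give the scaffold N50 and L50; applied to contig lengths, the contig N50 and L50.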

Assembly

The Arabidopsis suecica ASS3 Assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file) genome.fa.gz
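
As a rough sketch of how the downloaded assembly could be inspected, the snippet below reads a gzip-compressed FASTA file and reports the length of each sequence using only the Python standard library. The helper name `fasta_lengths` is hypothetical, and the sketch assumes a standard FASTA layout in which each header line starts with `>` and the sequence identifier is the first whitespace-delimited token.

```python
import gzip


def fasta_lengths(path):
    """Return a dict mapping each sequence ID to its length in bases.

    Assumes a gzip-compressed FASTA file; the ID is taken as the first
    whitespace-delimited token after '>' on each header line.
    """
    lengths = {}
    current = None
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                current = line[1:].split()[0]
                lengths[current] = 0
            elif current is not None:
                lengths[current] += len(line)
    return lengths
```

With the file saved locally, `fasta_lengths("genome.fa.gz")` would return one entry per scaffold, which can be compared against the scaffold count in the assembly statistics.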

Gene Predictions

The Arabidopsis suecica ASS3 genome gene prediction files are available in GFF3 and FASTA format.

Downloads

Genes (GFF3 file) genome.gff3.gz
CDS sequences (FASTA file) Arabidopsis_suecica_ASS3.cds.fa.gz
Protein sequences (FASTA file) Arabidopsis_suecica_ASS3.protein.fa.gz

Functional Analysis

Functional annotation for the Arabidopsis suecica ASS3 assembly is available for download below. The proteins were analyzed using InterProScan to assign InterPro domains (Pfam).

Downloads

Domain from InterProScan Arabidopsis_suecica_ASS3.Pfam.tsv.gz

S genes

Summary

Query SRK
Scaffold Asue_scaffold4
Size (bp) 19,168,148
Coordinates 7398946-7400242,7400502-7400636,7400726-7400959,7401050-7401199,7401292-7401532,7401636-7401786,7401863-7402198
BLASTp Hit sp|P0DH86|SRK_ARATH
BLASTp %ID 99

Query SCR
Scaffold Asue_scaffold4
Size (bp) 19,168,148
Coordinates 9750167-9750236,9749807-9750060
BLASTp Hit XP_006414465
BLASTp %ID 62
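
The coordinate strings in the summary above are comma-separated `start-end` segments on the scaffold. A minimal sketch of how they might be parsed and summed follows; it assumes the coordinates are 1-based and inclusive, which this record does not state, and the function names are illustrative only.

```python
def parse_segments(coords):
    """Parse 'start-end,start-end,...' into a list of (start, end) integer tuples."""
    segments = []
    for part in coords.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate trailing commas left by line wrapping
        start, end = part.split("-")
        segments.append((int(start), int(end)))
    return segments


def total_length(segments):
    """Sum segment lengths, ASSUMING 1-based inclusive coordinates."""
    return sum(end - start + 1 for start, end in segments)


srk_coords = (
    "7398946-7400242,7400502-7400636,7400726-7400959,"
    "7401050-7401199,7401292-7401532,7401636-7401786,"
    "7401863-7402198"
)
scr_coords = "9750167-9750236,9749807-9750060"

srk_segments = parse_segments(srk_coords)
scr_segments = parse_segments(scr_coords)
print(len(srk_segments), total_length(srk_segments))
print(len(scr_segments), total_length(scr_segments))
```

The SCR segments are listed in descending order, which may indicate a feature on the reverse strand; the record does not give strand information, so the sketch leaves the segments in their listed order.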


© 2023 National Genomics Data Center, China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences