Analysis Name | Arabidopsis arenosa UiO_Aaren_v1.0 Assembly & Annotation |
Sequencing technology | PacBio Sequel; Illumina HiSeq |
Assembly method | Canu v. 2.1 |
Release Date | 2022-11-14 |
Bramsiepe J, Krabberød AK, Bjerkan KN, Alling RM, Johannessen IM, Hornslien KS, Miller JR, Brysting AK, Grini PE. Structural evidence for MADS-box type I family expansion seen in new assemblies of Arabidopsis arenosa and A. lyrata. Plant J. 2023 Nov;116(3):942-961. doi: 10.1111/tpj.16401.
SUMMARY
Arabidopsis thaliana diverged from A. arenosa and A. lyrata at least 6 million years ago. The three species differ by genome-wide polymorphisms and morphological traits. The species are to a high degree reproductively isolated, but hybridization barriers are incomplete. A special type of hybridization barrier is based on the triploid endosperm of the seed, where embryo lethality is caused by endosperm failure to support the developing embryo. The MADS-box type I family of transcription factors is specifically expressed in the endosperm and has been proposed to play a role in endosperm-based hybridization barriers. The gene family is well known for its high evolutionary duplication rate, as well as being regulated by genomic imprinting. Here we address MADS-box type I gene family evolution and the role of type I genes in the context of hybridization. Using two de-novo assembled and annotated chromosome-level genomes of A. arenosa and A. lyrata ssp. petraea we analyzed the MADS-box type I gene family in Arabidopsis to predict orthologs, copy number, and structural genomic variation related to the type I loci. Our findings were compared to gene expression profiles sampled before and after the transition to endosperm cellularization in order to investigate the involvement of MADS-box type I loci in endosperm-based hybridization barriers. We observed substantial differences in type-I expression in the endosperm of A. arenosa and A. lyrata ssp. petraea, suggesting a genetic cause for the endosperm-based hybridization barrier between A. arenosa and A. lyrata ssp. petraea.
Assembly statistics
Genome size | 153 Mb |
Total ungapped length | 152.9 Mb |
Number of chromosomes | 8 |
Number of scaffolds | 264 |
Scaffold N50 | 19.2 Mb |
Scaffold L50 | 4 |
Number of contigs | 403 |
Contig N50 | 6.5 Mb |
Contig L50 | 8 |
GC percent | 36 |
Genome coverage | 78.0x |
Assembly level | Chromosome |
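The N50 and L50 values in the table above follow a standard definition: sort the sequences from longest to shortest and add them up until at least half of the total length is covered. N50 is the length of the sequence that crosses that point, and L50 is how many sequences it took. A minimal Python sketch of that calculation is given here; the function name `n50_l50` and the toy lengths are illustrative assumptions, not values taken from this assembly.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            # 'length' is the N50, 'count' is the L50
            return length, count
    raise ValueError("no sequence lengths given")


# Hypothetical toy lengths, NOT the real scaffold lengths of this assembly.
toy_lengths = [50, 40, 30, 20, 10]
print(n50_l50(toy_lengths))  # (40, 2)
```

Feeding the function the scaffold lengths of the assembly should, under this definition, give the scaffold N50 and L50; the contig values need the scaffolds to be split at gaps first.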
The Arabidopsis arenosa UiO_Aaren_v1.0 Assembly file is available in FASTA format.
Downloads
Chromosomes (FASTA file) | Arabidopsis_arenosa_genome.softmasked.fna.gz |
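As a rough illustration of how the soft-masked FASTA file might be inspected, the sketch below reads a gzipped FASTA file and reports per-sequence lengths and the overall GC percentage. It assumes the usual soft-masking convention (repeats in lowercase), so bases are upper-cased before counting, and it ignores N and other ambiguity codes when computing GC. The function name `fasta_lengths_and_gc` is hypothetical.

```python
import gzip


def fasta_lengths_and_gc(path):
    """Return ({sequence name: length}, GC percent) for a gzipped FASTA file."""
    lengths = {}
    gc = 0
    acgt = 0
    name = None
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Use the first word of the header as the sequence name.
                parts = line[1:].split()
                name = parts[0] if parts else ""
                lengths[name] = 0
            elif name is not None:
                lengths[name] += len(line)
                seq = line.upper()  # soft-masked repeats are lowercase
                g_or_c = seq.count("G") + seq.count("C")
                gc += g_or_c
                acgt += g_or_c + seq.count("A") + seq.count("T")
    gc_percent = 100.0 * gc / acgt if acgt else 0.0
    return lengths, gc_percent


# Usage (file name as listed above, assumed to be in the working directory):
# lengths, gc = fasta_lengths_and_gc("Arabidopsis_arenosa_genome.softmasked.fna.gz")
```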
The Arabidopsis arenosa UiO_Aaren_v1.0 gene prediction files are available in GFF3 and FASTA formats.
Downloads
Genes (GFF3 file) | A.arenosa.gff.gz |
CDS sequences (FASTA file) | A_arenosa.cds.fa.gz |
Protein sequences (FASTA file) | A_arenosa.pep.fa.gz |
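A quick way to get an overview of the gene predictions is to tally the feature types in the GFF3 file. The sketch below assumes the file follows the standard nine-column, tab-separated GFF3 layout, with the feature type in the third column; which feature types actually occur in this file (gene, mRNA, CDS and so on) is not stated on this page. The function name `count_gff3_features` is hypothetical.

```python
import gzip
from collections import Counter


def count_gff3_features(path):
    """Count GFF3 records per feature type (column 3) in a gzipped file."""
    counts = Counter()
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.startswith("##FASTA"):
                break  # an optional embedded sequence section follows
            if line.startswith("#") or not line.strip():
                continue  # skip directives, comments and blank lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) == 9:
                counts[fields[2]] += 1
    return counts


# Usage (file name as listed above, assumed to be in the working directory):
# print(count_gff3_features("A.arenosa.gff.gz").most_common())
```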
Functional annotation for the Arabidopsis arenosa UiO_Aaren_v1.0 assembly is available for download below. The proteins were analyzed using InterProScan to assign InterPro domains (Pfam).
Downloads
Domain from InterProScan | Arabidopsis_arenosa_UiO_Aaren_v1.0.Pfam.tsv.gz |
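The domain table can be grouped by protein with a few lines of Python. This sketch assumes the file uses the standard InterProScan TSV column order (protein accession in column 1, analysis name in column 4, signature accession in column 5, domain start and stop in columns 7 and 8); whether this particular file keeps that exact layout is an assumption and should be checked against its first lines. The function name `pfam_domains_by_protein` is hypothetical.

```python
import gzip
from collections import defaultdict


def pfam_domains_by_protein(path):
    """Map each protein ID to a list of (Pfam accession, start, stop)."""
    domains = defaultdict(list)
    with gzip.open(path, "rt") as handle:
        for line in handle:
            row = line.rstrip("\n").split("\t")
            # Keep only rows that look like Pfam hits in the assumed layout.
            if len(row) < 8 or row[3] != "Pfam":
                continue
            domains[row[0]].append((row[4], int(row[6]), int(row[7])))
    return dict(domains)


# Usage (file name as listed above, assumed to be in the working directory):
# hits = pfam_domains_by_protein("Arabidopsis_arenosa_UiO_Aaren_v1.0.Pfam.tsv.gz")
```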
Summary
Query | Chromosome | Size (bp) | Coordinates | BLASTp Hit | BLASTp %ID |
SRK | scaffold_7 | 21434880 | 8227742-8229044,8229931-8230065,8230163-8230344,8230422-8230632,8230720-8230957,8231047-8231197,8231284-8231613 | spP0DH86SRK_ARATH | 65 |
SCR | scaffold_7 | 21434880 | 11026945-11027014,11026539-11026768 | XP_006414465.1 | 78 |
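The Coordinates column lists each hit as comma-separated start-end segments on the scaffold. A small Python sketch shows how such a string can be split into segments and summed. It assumes the coordinates are 1-based and inclusive, which this page does not state; the function names are hypothetical.

```python
def parse_segments(coords):
    """Turn 'start-end,start-end,...' into a list of (start, end) tuples."""
    segments = []
    for part in coords.split(","):
        start, end = part.split("-")
        segments.append((int(start), int(end)))
    return segments


def spliced_length(segments):
    """Total segment length, ASSUMING 1-based inclusive coordinates."""
    return sum(end - start + 1 for start, end in segments)


# Coordinate strings copied from the table above.
srk = parse_segments(
    "8227742-8229044,8229931-8230065,8230163-8230344,8230422-8230632,"
    "8230720-8230957,8231047-8231197,8231284-8231613"
)
scr = parse_segments("11026945-11027014,11026539-11026768")

print(spliced_length(srk))  # 2550
print(spliced_length(scr))  # 300
```

Under the inclusive-coordinate assumption both totals are multiples of three, which is consistent with the segments being coding exons. The SCR segments are listed in descending order, which may indicate a reverse-strand gene, but the table itself does not give the strand.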