Arabidopsis lyrata ALyr_1.0 Assembly & Annotation

Overview

Analysis Name: Arabidopsis lyrata ALyr_1.0 Assembly & Annotation
Sequencing technology: PacBio Sequel; Illumina HiSeq
Assembly method: Canu v. 2.1
Release Date: 2022-11-14
Reference Publication(s)

Bramsiepe J, Krabberød AK, Bjerkan KN, Alling RM, Johannessen IM, Hornslien KS, Miller JR, Brysting AK, Grini PE. Structural evidence for MADS-box type I family expansion seen in new assemblies of Arabidopsis arenosa and A. lyrata. Plant J. 2023 Nov;116(3):942-961. doi: 10.1111/tpj.16401.

SUMMARY

Arabidopsis thaliana diverged from A. arenosa and A. lyrata at least 6 million years ago. The three species differ by genome-wide polymorphisms and morphological traits. The species are to a high degree reproductively isolated, but hybridization barriers are incomplete. A special type of hybridization barrier is based on the triploid endosperm of the seed, where embryo lethality is caused by endosperm failure to support the developing embryo. The MADS-box type I family of transcription factors is specifically expressed in the endosperm and has been proposed to play a role in endosperm-based hybridization barriers. The gene family is well known for its high evolutionary duplication rate, as well as being regulated by genomic imprinting. Here we address MADS-box type I gene family evolution and the role of type I genes in the context of hybridization. Using two de-novo assembled and annotated chromosome-level genomes of A. arenosa and A. lyrata ssp. petraea we analyzed the MADS-box type I gene family in Arabidopsis to predict orthologs, copy number, and structural genomic variation related to the type I loci. Our findings were compared to gene expression profiles sampled before and after the transition to endosperm cellularization in order to investigate the involvement of MADS-box type I loci in endosperm-based hybridization barriers. We observed substantial differences in type-I expression in the endosperm of A. arenosa and A. lyrata ssp. petraea, suggesting a genetic cause for the endosperm-based hybridization barrier between A. arenosa and A. lyrata ssp. petraea.

Assembly statistics

Genome size: 187.5 Mb
Total ungapped length: 187.5 Mb
Number of chromosomes: 8
Number of scaffolds: 54
Scaffold N50: 21.8 Mb
Scaffold L50: 4
Number of contigs: 459
Contig N50: 2.1 Mb
Contig L50: 22
GC percent: 36
Genome coverage: 72.0x
Assembly level: Chromosome
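
For readers who want to check the N50 and L50 figures against a downloaded sequence file, the following is a minimal sketch of how such statistics are commonly computed. It is not the pipeline used to produce the numbers above. The function names `fasta_lengths` and `n50_l50` are illustrative, and the sketch assumes a plain multi-record FASTA with one header line per sequence.

```python
def fasta_lengths(lines):
    """Return the length of every record in an iterable of FASTA lines."""
    lengths = []
    current = None  # None until the first header line is seen
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50_l50(lengths):
    """Return (N50, L50) for a collection of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total
    length; L50 is the number of sequences in that set.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no sequence lengths given")
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
```

To apply it to the chromosome FASTA listed under Downloads, the file could be opened with `gzip.open("Arabidopsis_lyrata_petraea_genome.fna.gz", "rt")` and the handle passed to `fasta_lengths`. Whether the result matches the scaffold or the contig figures depends on which sequences that file actually contains.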

Assembly

The Arabidopsis lyrata ALyr_1.0 Assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file) Arabidopsis_lyrata_petraea_genome.fna.gz

Gene Predictions

The Arabidopsis lyrata ALyr_1.0 genome gene prediction files are available in GFF3 and FASTA format.

Downloads

Genes (GFF3 file) A.lyrata.gff.gz
CDS sequences (FASTA file) Alyra.cds.fa.gz
Protein sequences (FASTA file) Alyra.pep.fa.gz
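
As a quick sanity check after downloading the annotation, one might tally the feature types in the GFF3 file. The sketch below assumes the file follows the standard nine-column, tab-separated GFF3 layout with the feature type in column 3. The function name `count_feature_types` is illustrative, and the feature types actually present in this file are not documented on this page.

```python
from collections import Counter


def count_feature_types(lines):
    """Count GFF3 feature types (column 3) in an iterable of lines."""
    counts = Counter()
    for line in lines:
        # An embedded ##FASTA section, if present, ends the annotation part.
        if line.startswith("##FASTA"):
            break
        # Skip blank lines, directives, and comments.
        if not line.strip() or line.startswith("#"):
            continue
        columns = line.rstrip("\n").split("\t")
        if len(columns) != 9:
            continue  # not a well-formed GFF3 feature line
        counts[columns[2]] += 1
    return counts
```

For the gzipped download, `gzip.open("A.lyrata.gff.gz", "rt")` yields a text handle that can be passed to the function directly.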

Functional Analysis

Functional annotation for the Arabidopsis lyrata ALyr_1.0 assembly is available for download below. The proteins were analyzed using InterProScan to assign InterPro domains (Pfam).

Downloads

Domain from InterProScan Arabidopsis_lyrata_ALyr_1.0.Pfam.tsv.gz
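
Since the reference publication focuses on the MADS-box type I family, a reader may want to pull out the proteins that carry a given Pfam signature. The sketch below assumes the download uses the standard InterProScan TSV layout, with the protein accession in column 1 and the signature accession in column 5. The layout of this particular file is not described on this page, so the columns should be checked first. The function name `proteins_with_signature` is illustrative. The default accession PF00319 is the Pfam SRF-type transcription factor domain, which is the MADS-box domain.

```python
def proteins_with_signature(lines, signature="PF00319"):
    """Return the set of protein IDs annotated with a signature accession.

    Assumes tab-separated rows with the protein accession in column 1
    and the signature accession in column 5 (standard InterProScan TSV).
    """
    hits = set()
    for line in lines:
        columns = line.rstrip("\n").split("\t")
        if len(columns) > 4 and columns[4] == signature:
            hits.add(columns[0])
    return hits
```

With the gzipped download, `gzip.open("Arabidopsis_lyrata_ALyr_1.0.Pfam.tsv.gz", "rt")` provides a suitable line iterator.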

S genes

Summary

Query: SRK
Chromosome: scaffold_7
Size (bp): 22694941
Coordinates: 9434460-9435762, 9436251-9436415, 9436470-9436669, 9436725-9436935, 9437019-9437256, 9437350-9437500, 9437581-9437910
BLASTp Hit: SRKb_AB052756.1_prot_BAB40987.1
BLASTp %ID: 98.0

Query: SCR
Chromosome: scaffold_7
Size (bp): 22694941
Coordinates: 9475859-9475985, 9476004-9476179
BLASTp Hit: SCRb_AB052754.1_prot_BAB40985.1
BLASTp %ID: 63.6
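
The coordinate lists for SRK and SCR can be turned into intervals and summed to get the combined segment length of each locus. The sketch below assumes the coordinates are 1-based and inclusive, as in GFF3. This page does not state the convention. The function names `parse_intervals` and `total_length` are illustrative.

```python
def parse_intervals(text):
    """Parse 'start-end,start-end,...' into a list of (start, end) tuples."""
    intervals = []
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue  # tolerate trailing commas and line wraps
        start, end = (int(value) for value in part.split("-"))
        intervals.append((start, end))
    return intervals


def total_length(intervals):
    """Sum interval lengths, assuming 1-based inclusive coordinates."""
    return sum(end - start + 1 for start, end in intervals)


# Coordinate strings copied from the S genes summary for scaffold_7.
SRK = ("9434460-9435762,9436251-9436415,9436470-9436669,9436725-9436935,"
       "9437019-9437256,9437350-9437500,9437581-9437910")
SCR = "9475859-9475985,9476004-9476179"

print(total_length(parse_intervals(SRK)))  # 2598
print(total_length(parse_intervals(SCR)))  # 303
```

Under the inclusive assumption, both totals are multiples of three. That is consistent with the segments being coding exons, although this page does not say so explicitly.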

Nucleotide

Protein

© 2023 National Genomics Data Center, China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences