Analysis Name | Arabidopsis lyrata ALyr_1.0 Assembly & Annotation |
Sequencing technology | PacBio Sequel; Illumina HiSeq |
Assembly method | Canu v. 2.1 |
Release Date | 2022-11-14 |
Bramsiepe J, Krabberød AK, Bjerkan KN, Alling RM, Johannessen IM, Hornslien KS, Miller JR, Brysting AK, Grini PE. Structural evidence for MADS-box type I family expansion seen in new assemblies of Arabidopsis arenosa and A. lyrata. Plant J. 2023 Nov;116(3):942-961. doi: 10.1111/tpj.16401.
SUMMARY: Arabidopsis thaliana diverged from A. arenosa and A. lyrata at least 6 million years ago. The three species differ by genome-wide polymorphisms and morphological traits. The species are to a high degree reproductively isolated, but hybridization barriers are incomplete. A special type of hybridization barrier is based on the triploid endosperm of the seed, where embryo lethality is caused by endosperm failure to support the developing embryo. The MADS-box type I family of transcription factors is specifically expressed in the endosperm and has been proposed to play a role in endosperm-based hybridization barriers. The gene family is well known for its high evolutionary duplication rate, as well as being regulated by genomic imprinting. Here we address MADS-box type I gene family evolution and the role of type I genes in the context of hybridization. Using two de-novo assembled and annotated chromosome-level genomes of A. arenosa and A. lyrata ssp. petraea we analyzed the MADS-box type I gene family in Arabidopsis to predict orthologs, copy number, and structural genomic variation related to the type I loci. Our findings were compared to gene expression profiles sampled before and after the transition to endosperm cellularization in order to investigate the involvement of MADS-box type I loci in endosperm-based hybridization barriers. We observed substantial differences in type-I expression in the endosperm of A. arenosa and A. lyrata ssp. petraea, suggesting a genetic cause for the endosperm-based hybridization barrier between A. arenosa and A. lyrata ssp. petraea.
Assembly statistics
Genome size | 187.5 Mb |
Total ungapped length | 187.5 Mb |
Number of chromosomes | 8 |
Number of scaffolds | 54 |
Scaffold N50 | 21.8 Mb |
Scaffold L50 | 4 |
Number of contigs | 459 |
Contig N50 | 2.1 Mb |
Contig L50 | 22 |
GC percent | 36 |
Genome coverage | 72.0x |
Assembly level | Chromosome |
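For readers unfamiliar with the N50 and L50 figures above, the following is a minimal sketch of how such values are conventionally defined: N50 is the length of the sequence at which the running total of lengths, sorted from longest to shortest, first reaches half of the assembly, and L50 is the number of sequences needed to get there. The function name `n50_l50` and the toy lengths are illustrative assumptions, not the pipeline used to produce the table.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of sequence lengths.

    Conventional definition; not taken from the ALyr_1.0 pipeline.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            # `length` is N50, `count` is L50.
            return length, count
    raise ValueError("lengths must be non-empty")


# Toy data (hypothetical lengths, not the real scaffolds).
toy_lengths = [50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_lengths)
print(n50, l50)
```

Applied to the real scaffold or contig lengths, the same calculation should give values comparable to those in the table, although tools can differ slightly in how they treat gaps and ties.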
The Arabidopsis lyrata ALyr_1.0 Assembly file is available in FASTA format.
Downloads
Chromosomes (FASTA file) | Arabidopsis_lyrata_petraea_genome.fna.gz |
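As a rough illustration of how the downloaded assembly could be inspected, the sketch below reads FASTA records and reports each sequence length and the overall GC percentage. It assumes the file is a standard gzip-compressed FASTA; the helper names `fasta_stats` and `fasta_stats_from_gz` are hypothetical, and only the Python standard library is used.

```python
import gzip


def fasta_stats(lines):
    """Return ({sequence_name: length}, gc_percent) from FASTA lines."""
    lengths = {}
    gc = 0
    acgt = 0
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sequence name.
            name = line[1:].split()[0]
            lengths[name] = 0
        elif name is not None:
            seq = line.upper()
            lengths[name] += len(seq)
            gc += seq.count("G") + seq.count("C")
            # Only unambiguous bases count toward the GC denominator.
            acgt += sum(seq.count(base) for base in "ACGT")
    gc_percent = 100.0 * gc / acgt if acgt else 0.0
    return lengths, gc_percent


def fasta_stats_from_gz(path):
    """Convenience wrapper for a gzip-compressed FASTA file."""
    with gzip.open(path, "rt") as handle:
        return fasta_stats(handle)


# Small in-memory example (made-up sequences).
example = [">chr_a demo", "ACGT", "GGCC", ">chr_b", "AATTNN"]
print(fasta_stats(example))
```

Calling `fasta_stats_from_gz("Arabidopsis_lyrata_petraea_genome.fna.gz")` on a local copy of the download would be the intended use. Ambiguous bases such as N are excluded from the GC denominator here, which is one common convention among several.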
The Arabidopsis lyrata ALyr_1.0 genome gene prediction files are available in GFF3 and FASTA format.
Downloads
Genes (GFF3 file) | A.lyrata.gff.gz |
CDS sequences (FASTA file) | Alyra.cds.fa.gz |
Protein sequences (FASTA file) | Alyra.pep.fa.gz |
Functional annotation for the Arabidopsis lyrata ALyr_1.0 assembly is available for download below. The proteins were analyzed using InterProScan to assign InterPro domains (Pfam).
Downloads
Domain from InterProScan | Arabidopsis_lyrata_ALyr_1.0.Pfam.tsv.gz |
Summary
Query | Chromosome | Size(bp) | Coordinates | BLASTp Hit | BLASTp %ID |
SRK | scaffold_7 | 22694941 | 9434460-9435762,9436251-9436415,9436470-9436669,9436725-9436935,9437019-9437256,9437350-9437500,9437581-9437910 | SRKb_AB052756.1_prot_BAB40987.1 | 98.0 |
SCR | scaffold_7 | 22694941 | 9475859-9475985,9476004-9476179 | SCRb_AB052754.1_prot_BAB40985.1 | 63.6 |
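The Coordinates column lists each hit as comma-separated start-end segments on scaffold_7. The sketch below parses such a string and sums the segment lengths. It assumes the coordinates are 1-based and inclusive, which the table does not state; `parse_segments` and `total_length` are hypothetical helper names.

```python
def parse_segments(coordinates):
    """Turn 'start-end,start-end,...' into a list of (start, end) tuples."""
    segments = []
    for part in coordinates.split(","):
        start, end = part.strip().split("-")
        segments.append((int(start), int(end)))
    return segments


def total_length(segments):
    """Sum segment lengths, assuming 1-based inclusive coordinates."""
    return sum(end - start + 1 for start, end in segments)


# Coordinates copied from the SCR row of the table.
scr = parse_segments("9475859-9475985,9476004-9476179")
print(scr)
print(total_length(scr))
```

Under the inclusive-coordinate assumption the two SCR segments sum to 303 bp and the seven SRK segments to 2598 bp. Both are multiples of three, as expected if the segments are coding exons.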