Oropetium thomaeum V2 Assembly & Annotation

Overview

Analysis Name: Oropetium thomaeum V2 Assembly & Annotation
Sequencing technology: PacBio, Illumina, Hi-C
Assembly method: Canu (V1.4)
Release Date: 2018-11-15
Reference Publication(s)

VanBuren R, Wai CM, Keilwagen J, Pardo J. A chromosome-scale assembly of the model desiccation tolerant grass Oropetium thomaeum. Plant Direct. 2018 Nov 15;2(11):e00096. doi: 10.1002/pld3.96.

Abstract

Oropetium thomaeum is an emerging model for desiccation tolerance and genome size evolution in grasses. A draft genome of Oropetium was recently sequenced, but the lack of a chromosome-scale assembly has hindered comparative analyses and downstream functional genomics. Here, we reassembled Oropetium, and anchored the genome into 10 chromosomes using high-throughput chromatin conformation capture (Hi-C) based chromatin interactions. A combination of high-resolution RNAseq data and homology-based gene prediction identified thousands of new, conserved gene models that were absent from the V1 assembly. This includes thousands of new genes with high expression across a desiccation timecourse. Comparison between the Sorghum and Oropetium genomes revealed a surprising degree of chromosome-level collinearity, and several chromosome pairs have near perfect synteny. Other chromosomes are collinear in the gene rich chromosome arms but have experienced pericentric translocations. Together, these resources will be useful for the grass-comparative genomic community and further establish Oropetium as a model resurrection plant.

Assembly statistics

Number of contigs: 436
Contig N50: 2.02 Mb
Scaffold N50: 20.5 Mb
Total assembly size: 236 Mb
Gene models: 28,835
BUSCO: 98.9%
Assembly level: Chromosome
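
The contig and scaffold N50 values above are summary statistics: N50 is the length L such that sequences of length L or longer together contain at least half of the assembled bases. The following is a minimal sketch of that calculation. The function name and the toy lengths are illustrative assumptions, not the actual Oropetium contig lengths or the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Hypothetical contig lengths, for illustration only.
toy_lengths = [50, 40, 30, 20, 10]
print(n50(toy_lengths))
```

Applied to the sequence lengths of a real assembly FASTA, the same function would give the contig or scaffold N50, depending on which set of sequences is measured.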

Assembly

The Oropetium thomaeum V2 Assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file) Oropetium_thomaeum.faa.gz

Gene Predictions

The Oropetium thomaeum V2 genome gene prediction files are available in GFF3 and FASTA format.

Downloads

Genes (GFF3 file) Oropetium_thomaeum.gff.gz
CDS sequences (FASTA file) Ot_cds.fa.gz
Protein sequences (FASTA file) Ot_pep.fa.gz
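
As a quick sanity check after downloading, the feature types in a GFF3 file can be tallied from its third column. The sketch below assumes the file follows the standard nine-column, tab-separated GFF3 layout; the example records and identifiers are made up, and the layout of the actual Oropetium_thomaeum.gff.gz has not been verified here.

```python
import gzip
from collections import Counter


def count_feature_types(lines):
    """Tally the feature type (column 3) of each GFF3 record."""
    counts = Counter()
    for line in lines:
        # Skip directives, comments, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) == 9:
            counts[fields[2]] += 1
    return counts


def count_feature_types_in_gzip(path):
    """Same tally, reading a gzip-compressed GFF3 file from disk."""
    with gzip.open(path, "rt") as handle:
        return count_feature_types(handle)


# Hypothetical records, for illustration only.
example = [
    "##gff-version 3\n",
    "Chr1\tsrc\tgene\t1\t900\t.\t+\t.\tID=g1\n",
    "Chr1\tsrc\tmRNA\t1\t900\t.\t+\t.\tID=g1.t1;Parent=g1\n",
    "Chr1\tsrc\tCDS\t1\t300\t.\t+\t0\tParent=g1.t1\n",
    "Chr1\tsrc\tCDS\t500\t900\t.\t+\t0\tParent=g1.t1\n",
    "Chr1\tsrc\tgene\t2000\t2600\t.\t-\t.\tID=g2\n",
    "Chr1\tsrc\tmRNA\t2000\t2600\t.\t-\t.\tID=g2.t1;Parent=g2\n",
    "Chr1\tsrc\tCDS\t2000\t2600\t.\t-\t0\tParent=g2.t1\n",
]
print(count_feature_types(example))
```

Calling count_feature_types_in_gzip on the downloaded file would give the same kind of tally. If the annotation uses "gene" features, that count could then be compared with the gene model total reported in the assembly statistics.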

Functional Analysis

Functional annotation for the Oropetium thomaeum V2 genome is available for download below. The proteins were analyzed with InterProScan to assign InterPro domains (Pfam).

Downloads

Domain from InterProScan Oropetium_thomaeum.Pfam.tsv.gz
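
A domain table of this kind can be collapsed into one set of Pfam accessions per protein. The sketch below assumes the file uses the standard InterProScan TSV column order (protein accession in column 1, analysis name in column 4, signature accession in column 5); whether Oropetium_thomaeum.Pfam.tsv.gz follows that layout is an assumption. The protein identifiers and accessions in the example are placeholders.

```python
from collections import defaultdict


def pfam_domains_by_protein(lines):
    """Map each protein accession to the set of Pfam accessions hit.

    Assumes standard InterProScan TSV columns:
    0 = protein accession, 3 = analysis, 4 = signature accession.
    """
    domains = defaultdict(set)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6 or fields[3] != "Pfam":
            continue
        domains[fields[0]].add(fields[4])
    return dict(domains)


# Placeholder rows, for illustration only (not real accessions).
example = [
    "prot1\tmd5a\t300\tPfam\tPF_A\tdomain A\t10\t120\t1e-20\tT\t01-01-2020\n",
    "prot1\tmd5a\t300\tPfam\tPF_B\tdomain B\t150\t280\t1e-10\tT\t01-01-2020\n",
    "prot2\tmd5b\t200\tPfam\tPF_A\tdomain A\t5\t110\t1e-15\tT\t01-01-2020\n",
    "prot2\tmd5b\t200\tOther\tX_1\tother hit\t5\t110\t1e-15\tT\t01-01-2020\n",
]
print(pfam_domains_by_protein(example))
```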

S genes

Summary

Query        ?  Size (bp)  Coordinates                           tBLASTn Hit  tBLASTn %ID  Domain
DUF247II-SΨ  2  31289883   13312260-13313252                     Shybrid      59           DUF247
HPS10-S      2  31289883   13336971-13337044, 13337440-13337542  ShybridS1    72           -
DUF247I-ZΨ   6  20568207   19269559-19270131                     Pvaginatum   80           DUF247
DUF247II-ZΨ  6  20568207   19265373-19265732                     Ttriandra    61           DUF247
HPS10-Z      6  20568207   19267731-19267840, 19268031-19268136  Lperrieri    65           -
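
The coordinate entries in the summary are either a single range or a comma-separated list of ranges. A small helper, sketched below, turns such an entry into per-segment lengths. It assumes the coordinates are 1-based and inclusive, which the page does not state, and the function name is illustrative.

```python
def segment_lengths(coordinates):
    """Return the length of each 'start-end' segment in a coordinate string.

    Assumes 1-based, inclusive coordinates, so length = end - start + 1.
    """
    lengths = []
    for segment in coordinates.split(","):
        start, end = (int(value) for value in segment.strip().split("-"))
        lengths.append(end - start + 1)
    return lengths


# Coordinate strings taken from the summary above.
print(segment_lengths("13312260-13313252"))
print(segment_lengths("13336971-13337044, 13337440-13337542"))
```

Under the inclusive-coordinate assumption, summing the returned list gives the total span covered by a multi-segment hit.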

Nucleotide

Protein

© 2023 National Genomics Data Center, China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences