Lolium perenne MPB_Lper_Kyuss_1697 Assembly & Annotation

Overview

Analysis Name Lolium perenne MPB_Lper_Kyuss_1697 Assembly & Annotation
Sequencing technology Oxford Nanopore PromethION; Illumina NovaSeq
Assembly method Flye v. 2.7.1-b1590
Release Date 2021-07-28
Reference Publication(s)

Frei D, Veekman E, Grogg D, Stoffel-Studer I, Morishima A, Shimizu-Inatsugi R, Yates S, Shimizu KK, Frey JE, Studer B, Copetti D. Ultralong Oxford Nanopore Reads Enable the Development of a Reference-Grade Perennial Ryegrass Genome Assembly. Genome Biol Evol. 2021 Aug 3;13(8):evab159. doi: 10.1093/gbe/evab159.

Abstract

Despite the progress made in DNA sequencing over the last decade, reconstructing telomere-to-telomere genome assemblies of large and repeat-rich eukaryotic genomes is still difficult. More accurate basecalls or longer reads could address this issue, but no current sequencing platform can provide both simultaneously. Perennial ryegrass (Lolium perenne L.) is an example of an important species for which the lack of a reference genome assembly hindered a swift adoption of genomics-based methods into breeding programs. To fill this gap, we optimized the Oxford Nanopore Technologies’ sequencing protocol, obtaining sequencing reads with an N50 of 62 kb—a very high value for a plant sample. The assembly of such reads produced a highly complete (2.3 of 2.7 Gb), correct (QV 45), and contiguous (contig N50 and N90 11.74 and 3.34 Mb, respectively) genome assembly. We show how read length was key in determining the assembly contiguity. Sequence annotation revealed the dominance of transposable elements and repeated sequences (81.6% of the assembly) and identified 38,868 protein coding genes. Almost 90% of the bases could be anchored to seven pseudomolecules, providing the first high-quality haploid reference assembly for perennial ryegrass. This protocol will enable producing longer Oxford Nanopore Technology reads for more plant samples and ushering forage grasses into modern genomics-assisted breeding programs.

Assembly statistics

Genome size 2.3 Gb
Number of chromosomes 7
Number of scaffolds 1,677
Scaffold N50 275.2 Mb
Scaffold L50 4
Number of contigs 1,915
Contig N50 11.1 Mb
Contig L50 66
Assembly level Chromosome
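
The N50 and L50 values above follow the usual definitions: sort sequences from longest to shortest, then accumulate lengths until at least half of the total assembly length is covered. N50 is the length of the sequence that crosses that threshold, and L50 is the number of sequences needed to reach it. The sketch below is a minimal illustration of that calculation, not the pipeline used for this assembly; the function name `n50_l50` and the toy lengths are hypothetical.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            # 'length' is the N50; 'count' sequences were needed (L50).
            return length, count
    raise ValueError("no sequence lengths given")


# Toy lengths for illustration only (not taken from this assembly).
toy_lengths = [100, 80, 60, 40, 20]
n50, l50 = n50_l50(toy_lengths)
print(f"N50 = {n50}, L50 = {l50}")
```

Applied to the scaffold lengths of this assembly, a calculation of this kind is what yields figures such as a scaffold L50 of 4, meaning that the four longest scaffolds together cover at least half of the assembly.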

Assembly

The Lolium perenne MPB_Lper_Kyuss_1697 genome assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file) Kyuss_1697_assembly.fa.gz

Gene Predictions

Gene prediction files for the Lolium perenne MPB_Lper_Kyuss_1697 genome are available in GFF3 and FASTA formats.

Downloads

Genes (GFF3 file) Kyuss_1697_KYUS.gff.gz
CDS sequences (FASTA file) Kyuss_1697_KYUS_CDS.fa.gz
Protein sequences (FASTA file) Kyuss_1697_KYUS_proteins.fa.gz
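
As a quick sanity check after download, the feature types in the GFF3 file can be tallied and compared with the gene count reported in the abstract. The sketch below relies only on the general GFF3 layout (nine tab-separated columns, feature type in the third, `#` for comments and directives). The function names are hypothetical, and it is an assumption that gene records in this particular file carry the type `gene` and that the file has been saved to the working directory.

```python
import gzip
from collections import Counter


def count_feature_types(lines):
    """Tally column 3 (feature type) over an iterable of GFF3 lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("##FASTA"):
            break  # an embedded sequence section, if present, ends the features
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) == 9:
            counts[cols[2]] += 1
    return counts


def count_feature_types_in_file(path):
    """Same tally, reading a gzip-compressed GFF3 file as text."""
    with gzip.open(path, "rt") as handle:
        return count_feature_types(handle)


# Toy records for illustration only (not taken from the annotation).
sample = [
    "##gff-version 3\n",
    "chr1\texample\tgene\t100\t900\t.\t+\t.\tID=g1\n",
    "chr1\texample\tmRNA\t100\t900\t.\t+\t.\tID=g1.t1;Parent=g1\n",
    "chr1\texample\tCDS\t100\t400\t.\t+\t0\tID=c1;Parent=g1.t1\n",
    "chr1\texample\tCDS\t600\t900\t.\t+\t0\tID=c1;Parent=g1.t1\n",
]
print(count_feature_types(sample))

# With the downloaded file in the working directory (assumed location):
# print(count_feature_types_in_file("Kyuss_1697_KYUS.gff.gz"))
```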

Functional Analysis

Functional annotation for the Lolium perenne MPB_Lper_Kyuss_1697 assembly is available for download below. The proteins were analyzed with InterProScan to assign InterPro domains (Pfam).

Downloads

Domain from InterProScan Lolium_perenne_MPB_Lper_Kyuss_1697.Pfam.tsv.gz

S genes

Summary

Query       Chromosome  Size (bp)  Coordinates                              tBLASTn Hit               tBLASTn %ID  Domain
DUF247I-S   chr1        263434395  38450637-38452229                        LpSDUF247-I_chromosome1   100          DUF247
DUF247II-S  chr1        263434395  38193663-38195315                        LpSDUF247-II_chromosome1  100          DUF247
HPS10-S     chr1        263434395  38449393-38449510,38449580-38449713      LpsS_chromosome1          100          -
DUF247I-Z   chr2        347306123  332860175-332861764                      LpZDUF247-I_chromosome2   100          DUF247
DUF247II-Z  chr2        347306123  332799119-332800780                      LpZDUF247-II_chromosome2  100          DUF247
HPS10-Z     chr2        347306123  332813093-332813199,332813281-332813425  LpsZ_chromosome2          100          -
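
The length of each hit can be derived from the coordinates in the summary table above. The sketch below sums the segments of each entry, assuming the coordinates are 1-based and inclusive at both ends (the page does not state this, so treat it as an assumption). The helper name `span_length` is hypothetical; the coordinate strings are copied from the table.

```python
def span_length(coords):
    """Total length in bp of one or more comma-separated 'start-end' segments.

    Assumes 1-based coordinates that include both end positions.
    """
    total = 0
    for segment in coords.split(","):
        start, end = (int(value) for value in segment.strip().split("-"))
        total += end - start + 1
    return total


# Coordinates as listed in the S genes summary table.
s_genes = {
    "DUF247I-S": "38450637-38452229",
    "DUF247II-S": "38193663-38195315",
    "HPS10-S": "38449393-38449510,38449580-38449713",
    "DUF247I-Z": "332860175-332861764",
    "DUF247II-Z": "332799119-332800780",
    "HPS10-Z": "332813093-332813199,332813281-332813425",
}

for name, coords in s_genes.items():
    print(f"{name}\t{span_length(coords)} bp")
```

Under the inclusive-coordinate assumption, every total is a multiple of three, which is what one would expect if the listed intervals correspond to coding sequence.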

