Setaria italica v2.0 Assembly & Annotation

Overview

Analysis Name Setaria italica v2.0 Assembly & Annotation
Sequencing technology ABI 3739
Assembly method ARACHNE v. 2007101641HA
Release Date 2015-10-30
Reference Publication(s)

Bennetzen JL, Schmutz J, Wang H, Percifield R, Hawkins J, Pontaroli AC, Estep M, Feng L, Vaughn JN, Grimwood J, Jenkins J, Barry K, Lindquist E, Hellsten U, Deshpande S, Wang X, Wu X, Mitros T, Triplett J, Yang X, Ye CY, Mauro-Herrera M, Wang L, Li P, Sharma M, Sharma R, Ronald PC, Panaud O, Kellogg EA, Brutnell TP, Doust AN, Tuskan GA, Rokhsar D, Devos KM. Reference genome sequence of the model plant Setaria. Nat Biotechnol. 2012 May 13;30(6):555-61. doi: 10.1038/nbt.2196.

Abstract

We generated a high-quality reference genome sequence for foxtail millet (Setaria italica). The ∼400-Mb assembly covers ∼80% of the genome and >95% of the gene space. The assembly was anchored to a 992-locus genetic map and was annotated by comparison with >1.3 million expressed sequence tag reads. We produced more than 580 million RNA-Seq reads to facilitate expression analyses. We also sequenced Setaria viridis, the ancestral wild relative of S. italica, and identified regions of differential single-nucleotide polymorphism density, distribution of transposable elements, small RNA content, chromosomal rearrangement and segregation distortion. The genus Setaria includes natural and cultivated species that demonstrate a wide capacity for adaptation. The genetic basis of this adaptation was investigated by comparing five sequenced grass genomes. We also used the diploid Setaria genome to evaluate the ongoing genome assembly of a related polyploid, switchgrass (Panicum virgatum).

Assembly statistics

Genome size 405.7 Mb
Total ungapped length 400.9 Mb
Number of chromosomes 9
Number of scaffolds 336
Scaffold N50 47.3 Mb
Scaffold L50 4
Number of contigs 6,778
Contig N50 126.3 kb
Contig L50 982
GC percent 46
Genome coverage 7.0x
Assembly level Chromosome
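
The N50 and L50 values above follow the usual definition: sort the sequences from longest to shortest and accumulate lengths until at least half of the total is covered. N50 is the length of the sequence that crosses that point and L50 is how many sequences were needed. The sketch below illustrates the calculation on made-up lengths; the function name `n50_l50` and the toy data are assumptions for illustration and are not taken from the assembly pipeline.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of sequence lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            # 'length' is the N50; 'count' is the L50.
            return length, count


# Toy lengths (hypothetical, not the real scaffold lengths).
toy_lengths = [80, 70, 50, 40, 30, 20]
print(n50_l50(toy_lengths))  # (70, 2)
```

Applied to the scaffold lengths of this assembly, the same calculation would be expected to reproduce the reported scaffold N50 and L50, provided the same set of sequences is used.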

Assembly

The Setaria italica v2.0 Assembly file is available in FASTA format.

Downloads

Chromosomes (FASTA file) GCA_000263155.2_Setaria_italica_v2.0_genomic.fna.gz
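
As a rough sketch of how the downloaded file might be inspected, the snippet below streams a gzipped FASTA file and reports the length and GC percentage of each record. It uses only the Python standard library. The helper names (`read_fasta`, `gc_percent`, `summarize`) are illustrative assumptions, and the path passed to `summarize` is assumed to be a local copy of the file listed above.

```python
import gzip


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA text lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gc_percent(seq):
    """GC percentage over A/C/G/T only; N and other symbols are ignored."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return 100.0 * gc / (gc + at) if (gc + at) else 0.0


def summarize(path):
    """Print accession, length, and GC percent for each record in a .fna.gz file."""
    with gzip.open(path, "rt") as handle:
        for header, seq in read_fasta(handle):
            print(header.split()[0], len(seq), round(gc_percent(seq), 1))


# Example call, assuming the file has been downloaded to the working directory:
# summarize("GCA_000263155.2_Setaria_italica_v2.0_genomic.fna.gz")
```

Each chromosome is held in memory while it is processed, which is usually acceptable for a genome of this size.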

Gene Predictions

The Setaria italica v2.0 gene prediction files are available in GFF3 and FASTA formats.

Downloads

Genes (GFF3 file) GCA_000263155.2_Setaria_italica_v2.0_genomic.gff.gz
CDS sequences (FASTA file) GCA_000263155.2_Setaria_italica_v2.0_cds_from_genomic.fna.gz
Protein sequences (FASTA file) GCA_000263155.2_Setaria_italica_v2.0_protein.faa.gz
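
A quick sanity check on the GFF3 file is to count how many features of each type it contains. The sketch below assumes the file follows the standard nine-column, tab-separated GFF3 layout, with the feature type in the third column. The function name `count_feature_types` is an illustrative assumption.

```python
import gzip
from collections import Counter


def count_feature_types(lines):
    """Count GFF3 feature types (column 3) from an iterable of text lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("##FASTA"):
            break  # an embedded sequence section follows; no more features
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments, and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            continue  # not a well-formed feature line
        counts[fields[2]] += 1
    return counts


# Example call, assuming the file has been downloaded to the working directory:
# with gzip.open("GCA_000263155.2_Setaria_italica_v2.0_genomic.gff.gz", "rt") as fh:
#     for feature_type, n in count_feature_types(fh).most_common():
#         print(feature_type, n)
```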

Functional Analysis

Functional annotation for the Setaria italica v2.0 assembly is available for download below. The proteins were analyzed with InterProScan to assign InterPro domains (Pfam).

Downloads

Domain from InterProScan Setaria_italica.Pfam.tsv.gz
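
The sketch below shows one way to group Pfam hits by protein from an InterProScan result file. It assumes the file uses the standard InterProScan TSV column order (protein accession, MD5, length, analysis, signature accession, signature description, start, stop, ...). The layout of `Setaria_italica.Pfam.tsv.gz` has not been verified here, so the column positions are an assumption. The function name `pfam_domains_by_protein` is also illustrative.

```python
import gzip
from collections import defaultdict


def pfam_domains_by_protein(lines):
    """Map protein accession -> list of (pfam_id, description, start, stop).

    Assumes standard InterProScan TSV columns; rows whose analysis column
    (column 4) is not 'Pfam' are skipped.
    """
    domains = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8 or fields[3] != "Pfam":
            continue
        protein, pfam_id, description = fields[0], fields[4], fields[5]
        start, stop = int(fields[6]), int(fields[7])
        domains[protein].append((pfam_id, description, start, stop))
    return dict(domains)


# Example call, assuming the file has been downloaded to the working directory:
# with gzip.open("Setaria_italica.Pfam.tsv.gz", "rt") as fh:
#     hits = pfam_domains_by_protein(fh)
#     print(len(hits), "proteins with at least one Pfam domain")
```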

S genes

Summary

Query        Chromosome  Size (bp)  Coordinates                          tBLASTn Hit    tBLASTn %ID  Domain
DUF247II-ZΨ  CM003534.1  35964315   31472715-31473911                    Telongatum     54           DUF247
HPS10-Z      CM003534.1  35964315   31470711-31470894,31470997-31471085  SspontaneumZ4  70           -
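
The coordinates in the summary can be turned into sequence by slicing the chromosome record. The sketch below parses a coordinate string, including comma-separated multi-interval entries such as the one given for HPS10-Z, and joins the corresponding slices. It assumes the coordinates are 1-based and inclusive and lie on the forward strand; the page does not state either, so both are assumptions. The function names are illustrative.

```python
def parse_intervals(text):
    """Parse 'start-end' or 'start-end,start-end' into a list of int pairs."""
    intervals = []
    for part in text.split(","):
        start, end = part.strip().split("-")
        intervals.append((int(start), int(end)))
    return intervals


def span_length(intervals):
    """Total number of bases covered, assuming 1-based inclusive intervals."""
    return sum(end - start + 1 for start, end in intervals)


def extract(sequence, intervals):
    """Concatenate the slices of 'sequence' named by 1-based inclusive intervals."""
    return "".join(sequence[start - 1:end] for start, end in intervals)


duf247 = parse_intervals("31472715-31473911")
hps10 = parse_intervals("31470711-31470894,31470997-31471085")
print(span_length(duf247))  # 1197
print(span_length(hps10))   # 273
```

Under the 1-based inclusive assumption both spans are multiples of three (1,197 bp and 273 bp), which is consistent with, though not proof of, that reading of the coordinates. A feature on the reverse strand would additionally need to be reverse-complemented after extraction.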


© 2023 National Genomics Data Center, China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences