Welcome to GSEAserver

The web server for annotation and enrichment analysis of de novo assembled transcripts

Advances in sequencing technologies over the past decade have generated copious amounts of data for in silico gene function analysis. Among these technologies, RNA-Seq has become a key tool for transcriptome analysis in both model and, more importantly, non-model organisms. Yet the scarcity and uneven coverage of functional annotation remains a bottleneck for annotating gene function and for understanding how biological processes are organized, how they function, and how they evolved. Gene set enrichment analysis is a widely used method for automatically annotating gene function and mining knowledge from high-throughput sequencing data. However, currently available sequence annotation and functional enrichment analysis tools work only on well-annotated model organisms with reference genome information, which greatly limits their use in the study of non-model species that lack an annotated reference genome. This is especially true in the plant kingdom, where hundreds of species have no draft or complete genome sequence available.

Here, we present GSEAserver, an online, high-performance pipeline for fast functional annotation and gene/transcript set enrichment analysis of de novo assembled transcripts from non-model plants. Its main features include:

“On-the-fly” functional annotation of user-uploaded de novo assembled transcripts by BLASTX search against closely related plant species in the NCBI non-redundant (NR) protein database (48 plant species included), which have been pre-annotated using the Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) reference databases;
Accurate estimation of the significance of GO term over-representation through random sampling. Compared with Fisher's exact test and the hypergeometric test, which are adopted in most GSEA tools, the random sampling method is accurate but computationally intensive, and it is more suitable for analyzing de novo transcriptome assemblies from RNA-Seq data. We addressed the computational cost by developing an ultra-fast parallel random sampling algorithm to estimate the statistical significance of enriched gene sets.
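
To illustrate the idea behind random sampling estimation, the following is a minimal, single-threaded Python sketch. It is not the GSEAserver implementation: the function name `sampling_p_value`, its parameters, and the placeholder term label `GO:TERM_A` are all assumptions made for illustration. It counts how many transcripts in a study set carry a given GO term, then repeatedly draws random sets of the same size from the annotated background and reports how often a random set contains at least as many annotated transcripts.

```python
import random


def sampling_p_value(annotations, study_set, term, n_samples=10000, seed=0):
    """Estimate an empirical p-value for over-representation of `term`.

    annotations: dict mapping transcript ID -> set of GO terms (background).
    study_set:   list of transcript IDs of interest (e.g. differentially
                 expressed transcripts); must not exceed the background size.
    All names here are hypothetical, chosen only for this sketch.
    """
    rng = random.Random(seed)          # fixed seed keeps the estimate reproducible
    background = list(annotations)
    k = len(study_set)

    def count_hits(transcripts):
        # Number of transcripts in the given set annotated with `term`.
        return sum(term in annotations.get(t, ()) for t in transcripts)

    observed = count_hits(study_set)

    # Draw random sets of the same size and count how often they contain
    # at least as many annotated transcripts as the study set.
    at_least_as_extreme = 0
    for _ in range(n_samples):
        if count_hits(rng.sample(background, k)) >= observed:
            at_least_as_extreme += 1

    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_samples + 1)


# Toy background: 100 transcripts, the first 10 carry the placeholder term.
toy = {f"t{i}": ({"GO:TERM_A"} if i < 10 else set()) for i in range(100)}
enriched = [f"t{i}" for i in range(10)]       # all 10 annotated transcripts
depleted = [f"t{i}" for i in range(50, 60)]   # no annotated transcripts
p_enriched = sampling_p_value(toy, enriched, "GO:TERM_A", n_samples=1000)
p_depleted = sampling_p_value(toy, depleted, "GO:TERM_A", n_samples=1000)
```

Because each random draw is independent of the others, the sampling loop can in principle be divided among several workers and the counts summed afterwards, which is one plausible way such an estimate could be parallelized.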