SPEED2

Gene signatures

SPEED2 is a signaling pathway annotation enrichment analysis tool with gene sets derived from manually curated pathway perturbation experiments in human cell lines. Identifying modulated pathways upstream of differentially expressed genes can facilitate the understanding of the regulatory mechanisms involved. Currently, only human genes and pathways are supported. SPEED2 gene sets are extracted from publicly available microarray data from the GEO database, which have been selected for pathway-specific perturbations. Each experiment is normalised using z-value transformation as previously described (Parikh et al, 2010). The z-values are rescaled for each experiment to the interval [-1,1] (which we term Zrank). For each pathway, the average Zrank of each gene across the experiments (Nexp) is calculated, and its significance is assessed by a p-value, using the average of Nexp uniformly distributed variables on [-1,1] as the null model (Bates distribution). The resulting ranked lists, which cover all measured genes, are then used to assess the enrichment or depletion of a user-provided gene list in the Run SPEED2 analysis.
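The Bates null model described above can be illustrated with a short Python sketch. This is not the SPEED2 implementation: the function names irwin_hall_cdf and bates_pvalue are hypothetical, the two-sided form of the p-value is an assumption, and the closed-form sum is only numerically reliable for a modest number of experiments.

```python
import math


def irwin_hall_cdf(s: float, n: int) -> float:
    """CDF at s of the sum of n independent Uniform(0, 1) variables."""
    if s <= 0:
        return 0.0
    if s >= n:
        return 1.0
    total = 0.0
    for k in range(int(math.floor(s)) + 1):
        total += (-1) ** k * math.comb(n, k) * (s - k) ** n
    return total / math.factorial(n)


def bates_pvalue(mean_zrank: float, n_exp: int) -> float:
    """Two-sided p-value (assumed form) for an average of n_exp Zrank values,
    under the null that each value is Uniform(-1, 1)."""
    # Map the mean on [-1, 1] to a sum of n_exp Uniform(0, 1) variables.
    s = n_exp * (mean_zrank + 1.0) / 2.0
    # The null is symmetric around n_exp / 2, so evaluate the smaller tail,
    # which also keeps the alternating sum short and stable.
    tail = irwin_hall_cdf(min(s, n_exp - s), n_exp)
    return min(1.0, 2.0 * tail)


# An average Zrank of 0.5 across 2 experiments.
print(bates_pvalue(0.5, 2))  # 0.25
```

A gene with a consistently high Zrank across many experiments receives a small p-value, while an average near zero gives a p-value close to 1.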

Although the Run SPEED2 analysis includes all measured genes in each pathway (only their position in the ranked list differs), the same data set can also be used to extract a pathway-specific signature. For this we took, for each pathway, the 300 most significant genes and kept those with an adjusted p-value (Qval) below 0.05; these signatures are provided in the table below. Regulation shows the direction in which the gene is regulated when the pathway is activated. Nexp shows the number of experiments in which this gene was measured. Zrank provides the average signed rank for this gene across those experiments. Pval and Qval denote the p-value and adjusted p-value for this gene derived from the Bates distribution.
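The selection rule above can be sketched in a few lines of Python. This is an illustration, not the code used to build the table: the function name extract_signature, the gene names, and the "up"/"down" labels are hypothetical. The field names follow the table columns.

```python
from typing import Dict, List


def extract_signature(genes: List[Dict], max_genes: int = 300,
                      q_cutoff: float = 0.05) -> List[Dict]:
    """Return up to max_genes of the most significant genes with Qval < q_cutoff."""
    significant = [g for g in genes if g["Qval"] < q_cutoff]
    significant.sort(key=lambda g: g["Pval"])  # most significant first
    signature = []
    for g in significant[:max_genes]:
        entry = dict(g)  # copy so the input records are not modified
        # Assumed labelling: the sign of the average Zrank gives the direction.
        entry["Regulation"] = "up" if g["Zrank"] > 0 else "down"
        signature.append(entry)
    return signature


# Hypothetical records for one pathway.
pathway_genes = [
    {"Gene": "GENE_A", "Zrank": 0.8, "Pval": 1e-6, "Qval": 1e-4},
    {"Gene": "GENE_B", "Zrank": -0.6, "Pval": 1e-4, "Qval": 0.01},
    {"Gene": "GENE_C", "Zrank": 0.1, "Pval": 0.3, "Qval": 0.6},
]
for row in extract_signature(pathway_genes):
    print(row["Gene"], row["Regulation"])
```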

You can select pathways, genes and regulation to filter the table, or download the entire table as a .csv file by pressing the CSV button.

Pathway	Gene	Regulation	Nexp	Zrank	Pval	Qval

SPEED2

SPEED2 is a signaling pathway enrichment analysis tool with pathway signatures derived from a large number of manually curated pathway perturbation experiments across many different human cell types. Genes are scored based on the causal influence of pathway perturbations rather than on pathway membership. Here you can perform an enrichment analysis on a list of human genes against the SPEED2 gene signature database.

For testing you can select an example gene set from the MSigDB Hallmark sets:

Run SPEED2

Genes to investigate
Paste a list of gene IDs (either Entrez Gene IDs or HGNC gene symbols), separated by spaces, tabs, line breaks or commas (maximum 500):

Background gene list (optional)
By default, all genes in the SPEED2 database are used as background.

Test statistics for enrichment

Bates test (recommended)
Chi2-test

Signature genes

Note that the submitted list was tested against the transcriptome-wide ranked gene lists per pathway. To extract representative candidates, we have also generated gene signatures consisting of the 300 most significant genes per pathway (provided enough genes were significant). Below you will find the overlap of those signatures with the user-submitted list, which you can use to derive representatives for the top regulated pathways. All signature genes can be found on the "Gene signatures" page.

Pathway	Gene	Regulation	Nexp	Zrank	Pval	Qval

Pathways

In SPEED2, we considered 16 signalling pathways, named after the stimulus or important signalling mediators: Estrogen, H2O2, Hippo, Hypoxia, IL-1, Insulin, JAK-STAT, MAPK+PI3K, TLR, Notch, PPAR, p53, TGFb, TNFa, VEGF and Wnt.

Gene signatures, i.e. genes that are significantly regulated by the pathways, can have a strong overlap between pathways, as signalling pathways are known to show cross-talk.

The figure below shows the Spearman rank correlation of mutually significant genes for all selected pathways (after filtering out genes with P > 0.05).
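As a rough illustration of how such a pairwise correlation could be computed, the Python sketch below keeps only genes with P <= 0.05 in both pathways and correlates the ranks of their average Zrank values. The data layout (a dictionary mapping each gene to a (Zrank, Pval) pair), the function names, and the example values are assumptions, not the code behind the figure.

```python
import math
from typing import Dict, List, Optional, Tuple


def average_ranks(values: List[float]) -> List[float]:
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def spearman_mutual(a: Dict[str, Tuple[float, float]],
                    b: Dict[str, Tuple[float, float]],
                    p_cutoff: float = 0.05) -> Optional[float]:
    """Spearman correlation of Zrank over genes significant in both pathways."""
    shared = sorted(g for g in a if g in b
                    and a[g][1] <= p_cutoff and b[g][1] <= p_cutoff)
    if len(shared) < 2:
        return None
    ra = average_ranks([a[g][0] for g in shared])
    rb = average_ranks([b[g][0] for g in shared])
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    va = sum((x - ma) ** 2 for x in ra)
    vb = sum((y - mb) ** 2 for y in rb)
    if va == 0 or vb == 0:
        return None  # correlation is undefined when all ranks are tied
    return cov / math.sqrt(va * vb)


# Hypothetical (Zrank, Pval) values for two pathways.
pathway_1 = {"G1": (0.9, 0.001), "G2": (0.5, 0.01), "G3": (-0.7, 0.002), "G4": (0.1, 0.6)}
pathway_2 = {"G1": (0.8, 0.003), "G2": (0.6, 0.02), "G3": (-0.9, 0.001), "G4": (0.9, 0.001)}
print(spearman_mutual(pathway_1, pathway_2))
```

In this example G4 is dropped because it is not significant in the first pathway, and the three remaining genes are ordered identically in both pathways.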

About SPEED2: Signalling Pathway Enrichment using Expression Data set version 2

SPEED2 is a follow-up to our previous web service SPEED (https://speed.sys-bio.net/). This new version contains 16 instead of 11 pathways, and is based on three times as many experimental data sets. It also uses different statistical methods to derive and score signatures.

How are SPEED2 gene signatures derived?

In short, for each pathway we identify perturbation experiments in the GEO microarray database and calculate rescaled z-scores of differential expression for each gene across all corresponding experiments (N), and an associated p-value, using the average of N uniformly distributed variables on [-1,1] as null model, the so-called Bates distribution. We compute corresponding false discovery rates (FDRs) and q-values of the obtained p-values by randomly shuffling the rescaled z-values of each experiment before averaging z-values across experiments (1000 times). Details and the signature can be found on the "Gene Signatures" page.
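The shuffling step can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the SPEED2 pipeline: every gene is assumed to be measured in every experiment (so thresholding the absolute average is equivalent to thresholding the p-value), the FDR estimator (mean number of shuffled genes passing the threshold divided by the number of observed genes passing it) is an assumed form, and the names gene_means and permutation_fdr are hypothetical.

```python
import random
from typing import List


def gene_means(zranks: List[List[float]]) -> List[float]:
    """Average the rescaled z-values of each gene across experiments.

    zranks[e][g] is the value of gene g in experiment e.
    """
    n_exp = len(zranks)
    return [sum(column) / n_exp for column in zip(*zranks)]


def permutation_fdr(zranks: List[List[float]], threshold: float,
                    n_perm: int = 1000, seed: int = 0) -> float:
    """Estimate the FDR for calling genes with |average| >= threshold."""
    rng = random.Random(seed)
    observed = sum(abs(m) >= threshold for m in gene_means(zranks))
    if observed == 0:
        return 0.0  # nothing is called, so there are no false discoveries
    null_hits = 0
    for _ in range(n_perm):
        # Shuffle the values within each experiment before averaging.
        shuffled = [rng.sample(exp, len(exp)) for exp in zranks]
        null_hits += sum(abs(m) >= threshold for m in gene_means(shuffled))
    expected_false = null_hits / n_perm
    return min(1.0, expected_false / observed)


# Toy data: 3 experiments that rank 101 genes identically.
toy = [[i / 50 - 1 for i in range(101)] for _ in range(3)]
print(permutation_fdr(toy, threshold=0.89, n_perm=200))
```

Because the three toy experiments agree perfectly, far more genes pass the threshold in the observed data than in the shuffled data, and the estimated FDR is small.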

How are SPEED2 gene signatures scored?

By default, SPEED2 considers all genes with expression values in the SPEED2 database as the background. Optionally, a user can provide a gene list in a text file as the background set (in the same format as above). There should be no comments or headers in the file.
These gene lists are then scored using the ranks for each pathway, using two different methods to score enrichment:
Bates statistics: Scores whether the average rank deviates from zero. This score is suitable if the provided gene list contains genes that are either all upregulated or all downregulated. As a result, the average score is shown, and the bars are colored according to the significance (FDR-adjusted p-value using the Bates distribution).
Chi2 statistics: Scores whether the rank variance deviates significantly from that of a uniform distribution. This score is suitable if the provided gene list contains genes that are both up- and downregulated. As a result, the increase in the mean squared ranks compared to a random gene set is shown, and the bars are colored according to the significance (FDR-adjusted p-value using the Chi2 distribution).
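The difference between the two scores can be illustrated with a small Python sketch. It computes the two effect sizes described above (the average rank and the excess of the mean squared rank over the value 1/3 expected for ranks that are uniform on [-1,1]), but it swaps in a normal approximation for the p-values instead of the Bates and Chi2 distributions used by SPEED2. The function names and example ranks are hypothetical.

```python
import math
from statistics import NormalDist
from typing import List, Tuple


def two_sided_p(z: float) -> float:
    """Two-sided p-value of a standard normal z-score."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def mean_rank_score(ranks: List[float]) -> Tuple[float, float]:
    """Average rank and approximate p-value (normal approximation, not Bates)."""
    n = len(ranks)
    mean = sum(ranks) / n
    # A Uniform(-1, 1) variable has variance 1/3, so the mean has variance 1/(3n).
    return mean, two_sided_p(mean / math.sqrt(1.0 / (3.0 * n)))


def squared_rank_score(ranks: List[float]) -> Tuple[float, float]:
    """Excess mean squared rank and approximate p-value (normal, not Chi2)."""
    n = len(ranks)
    excess = sum(r * r for r in ranks) / n - 1.0 / 3.0
    # For U ~ Uniform(-1, 1): Var(U^2) = 1/5 - 1/9 = 4/45.
    return excess, two_sided_p(excess / math.sqrt(4.0 / (45.0 * n)))


# Hypothetical ranks of a query list within one pathway's ranked gene list.
up_only = [0.8, 0.9, 0.7, 0.85, 0.95, 0.6, 0.75, 0.9, 0.8, 0.7]
mixed = [0.9, -0.9, 0.8, -0.8, 0.95, -0.95, 0.85, -0.85, 0.9, -0.9]

print(mean_rank_score(up_only))
print(mean_rank_score(mixed))
print(squared_rank_score(mixed))
```

The mixed list has an average rank of zero, so the mean-based score does not detect it, whereas its squared ranks are clearly larger than expected for a random gene set.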

Download raw data and source code

To perform any custom analysis you can use the R package speed2, which is available at https://github.com/molsysbio/speed2. To install and load it and to open its instructions, use the following commands in R:


library(devtools)

install_github("molsysbio/speed2")

library(speed2)

vignette("speed2")

You can download the processed or raw SPEED2 data that were used to derive the signatures. A list of the experiments that were used to derive the SPEED2 signatures is available here: list of experiments, and the signatures are available here as tab-delimited files: SPEED2 signatures. The scripts to reproduce the figures of the SPEED2 paper can be found here: create_figures.tgz

Contact information

Nils Blüthgen

How to cite

If you use SPEED2 for your own research, please cite the following article:

Mattias Rydenfelt, Bertram Klinger, Martina Klünemann, Nils Blüthgen, SPEED2: inferring upstream pathway activity from differential gene expression, Nucleic Acids Research (2020), gkaa236 Link

Changelog

Date X: Original publication

Help - how to use SPEED2

SPEED2 performs an enrichment analysis for signalling pathway target/signature genes in a user-provided list of genes.
To run SPEED2, select Run SPEED2 in the menu bar, and provide the following three pieces of information:

Genes to investigate
A list of the human genes of interest, for instance differentially expressed genes from an experiment. These gene lists can be lists of either Ensembl gene IDs or Gene Symbols, separated by white space, tab, comma, or new line. To try example gene lists, you can select examples from the menu above. These are signatures from the MSigDB hallmarks list.
Background gene list (optional)
It is often useful to restrict the analysis to genes that have been measured, for instance genes that are on a microarray, or genes that are expressed in a certain experiment. These gene lists can be uploaded as a file, again as Ensembl gene IDs or Gene Symbols, separated by white space, tab, comma, or new line. If not specified, all genes in the genome are considered for the statistical tests below.
Test statistics for enrichment
Select the statistical test. Use Bates test if the query list contains only upregulated or only downregulated genes, use Chi2-test if the query list contains both up- and downregulated genes.
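To make the accepted input format concrete, here is a minimal Python sketch of how a pasted gene list could be split on white space, tabs, commas and new lines. It is not the parser used by the web service: the function name parse_gene_list is hypothetical, removing duplicates is an assumption, and the limit of 500 genes is taken from the note on the Run SPEED2 form.

```python
import re
from typing import List

MAX_GENES = 500  # limit stated on the Run SPEED2 form


def parse_gene_list(text: str, max_genes: int = MAX_GENES) -> List[str]:
    """Split pasted text into gene identifiers, keeping the first occurrence of each."""
    tokens = [t for t in re.split(r"[\s,]+", text.strip()) if t]
    genes: List[str] = []
    seen = set()
    for token in tokens:
        if token not in seen:  # dropping duplicates is an assumption
            seen.add(token)
            genes.append(token)
    if len(genes) > max_genes:
        raise ValueError(f"{len(genes)} genes given, at most {max_genes} allowed")
    return genes


print(parse_gene_list("TP53, MYC\tEGFR\nTP53 JUN"))
# ['TP53', 'MYC', 'EGFR', 'JUN']
```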

Subsequently, press Run analysis to perform the SPEED2 analysis. First, a progress wheel will appear below, followed by the results after 1-2 seconds.
The results are displayed in two ways:

First, a graphic shows the pathways ranked by significance. A bar graph shows the effect size, which depends on the chosen test: for the Bates test, the mean rank of the supplied gene group in each signature is shown; for the Chi2 test, the increase or decrease in the variance of the rank distribution is shown. A download link lets you download the results as a comma-separated file for further analysis.
Second, a table shows genes that are significant signature genes for different pathways. This table can be sorted by column by clicking on the respective table header, and can also be downloaded by clicking on the CSV button.