
Welcome to Granatum! This is a graphical single-cell RNA-seq (scRNA-seq) analysis pipeline for genomics scientists. The pipeline guides you graphically through the analysis of scRNA-seq data, starting from expression and metadata tables. It provides a comprehensive set of modules for quality control / normalization, clustering, differential gene expression / enrichment analysis, protein interaction network visualization, and cell pseudo-time pathway construction.

Note 1: please do not click your browser's "Back" button. To restart the pipeline, click your browser's "Refresh" button.

Note 2: depending on dataset size, some steps may take time. Please allow computations to complete even if your browser appears to hang.

Upload


Is your data Human or Mouse? Make a selection under "Species". Then provide your Expression and Metadata tables as comma-separated value (CSV) files.


Before uploading your data, please refer to our format specification.

If you would like to add more datasets, click "Add another dataset" on the next page.


Summary of datasets uploaded

Last dataset uploaded

Batch-effect removal

Remove confounding effects from data generated in batches. Box plots give expression statistics for a random sampling of up to 96 cells. Select a batch grouping label (factor) then click "Remove batch effect". If multiple datasets were separately uploaded, the "dataset" factor can be used.
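To illustrate the general idea of removing a batch effect, here is a minimal sketch that centers one gene's expression values within each batch. It is a simplified stand-in and not necessarily the method this step uses; the function name `center_batches` and the batch labels are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def center_batches(values, batches):
    """Shift each batch so its mean matches the overall mean (one gene).

    values:  expression of a single gene, one entry per cell
    batches: batch label of each cell, in the same order as values
    """
    overall = mean(values)
    by_batch = defaultdict(list)
    for value, batch in zip(values, batches):
        by_batch[batch].append(value)
    batch_means = {batch: mean(vals) for batch, vals in by_batch.items()}
    # Remove the batch-specific offset, then restore the overall level.
    return [v - batch_means[b] + overall for v, b in zip(values, batches)]


# Hypothetical example: the second batch reads about 10 units higher.
values = [1.0, 3.0, 11.0, 13.0]
batches = ["run1", "run1", "run2", "run2"]
corrected = center_batches(values, batches)
print(corrected)  # [6.0, 8.0, 6.0, 8.0]
```

After centering, both batches share the same mean, so the remaining differences reflect variation between cells rather than between batches.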



Outlier removal

Remove unusual cells, e.g., those damaged by capture. Select cells by clicking points in the plot and/or using "Auto-identify", then click "Remove selected".





Selected cells:

Normalization

Adjust expression levels to correct for artificial differences between cells, e.g., differences in sequencing depth. When a rescaling/normalization button is clicked, the box plot (showing expression statistics for up to 96 randomly selected cells) will reflect changes. For example, clicking "Rescaling to geometric mean" will cause red dots (geometric means) to align. Note that clicking more than one rescaling/normalization button will apply adjustments on already adjusted values (use "Reset" to go back to unadjusted data).
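As a rough illustration of why the geometric means align after rescaling, the sketch below scales each cell so that all cells share one geometric mean. It assumes the geometric mean is taken over strictly positive values only and that the common target is the geometric mean of the per-cell geometric means; both choices, and the function names, are assumptions for illustration and may differ from what this step computes.

```python
import math


def geometric_mean(values):
    """Geometric mean of the strictly positive entries (zeros are skipped)."""
    positive = [v for v in values if v > 0]
    if not positive:
        raise ValueError("need at least one positive value")
    return math.exp(sum(math.log(v) for v in positive) / len(positive))


def rescale_to_geometric_mean(cells):
    """cells: dict mapping cell name -> list of expression values."""
    gms = {name: geometric_mean(vals) for name, vals in cells.items()}
    target = geometric_mean(list(gms.values()))
    # Multiplying a cell by target / gm moves its geometric mean to target.
    return {
        name: [v * target / gms[name] for v in vals]
        for name, vals in cells.items()
    }


# Hypothetical cells: cell_b was sequenced about twice as deeply as cell_a.
cells = {"cell_a": [1.0, 4.0, 0.0, 16.0], "cell_b": [2.0, 8.0, 32.0, 0.0]}
rescaled = rescale_to_geometric_mean(cells)
for name, vals in rescaled.items():
    print(name, round(geometric_mean(vals), 6))
```

Zero values stay zero under this scaling, and applying the function a second time leaves the values essentially unchanged because the geometric means already agree.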


Gene filtering

Remove genes having very low expression and/or those with little variation (dispersion) by moving the sliders. It is recommended to keep at least 2,000 genes.
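The effect of the two sliders can be sketched as a pair of thresholds applied to each gene. The example below assumes dispersion is measured as variance divided by mean, which is one common definition and may not be the one used here; the function name, gene names, and threshold values are hypothetical.

```python
from statistics import mean, pvariance


def filter_genes(expression, min_mean, min_dispersion):
    """Keep genes whose mean expression and dispersion pass both thresholds.

    expression: dict mapping gene name -> list of values, one per cell
    Dispersion is taken as variance / mean (an assumption for this sketch).
    """
    kept = {}
    for gene, values in expression.items():
        gene_mean = mean(values)
        if gene_mean <= 0:
            continue  # unexpressed gene; dispersion is undefined
        dispersion = pvariance(values) / gene_mean
        if gene_mean >= min_mean and dispersion >= min_dispersion:
            kept[gene] = values
    return kept


expression = {
    "gene_low": [0.0, 0.1, 0.0, 0.1],       # very low expression
    "gene_flat": [5.0, 5.0, 5.0, 5.0],      # expressed, but no variation
    "gene_variable": [1.0, 9.0, 2.0, 8.0],  # expressed and variable
}
kept = filter_genes(expression, min_mean=1.0, min_dispersion=0.5)
print(list(kept))  # ['gene_variable']
```

Raising either threshold removes more genes, so the post-filtering count is worth checking against the recommended minimum.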



Starting number of genes:
Post-filtering number of genes:

Clustering

Select a clustering method and enter a number of clusters (or check the box for auto selection), then click "Run clustering".




Differential expression

Identify differentially expressed genes between clusters. With the number of cores set to 2, the analysis runs for approximately 30 minutes on the Kim et al. (2016) dataset (116 cells, 3,788 genes, 3 clusters) when using a VirtualBox appliance with 8 GB RAM and an Intel i7 processor. Note: the progress bar does not accurately reflect progress; please give the calculations time to complete.

Once complete, the enrichment of differentially expressed genes in KEGG pathways and GO terms can be calculated.





Tabs indicate cluster numbers. Genes are sorted by absolute Z-score.
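Sorting by absolute Z-score places strongly up-regulated and strongly down-regulated genes together at the top of each table. A minimal sketch, using hypothetical gene names and Z-scores:

```python
def sort_by_abs_z(z_scores):
    """Return (gene, z) pairs ordered from largest to smallest |z|."""
    return sorted(z_scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical Z-scores for one cluster.
z_scores = {"GeneA": 1.2, "GeneB": -3.5, "GeneC": 2.8}
ranked = sort_by_abs_z(z_scores)
print(ranked)  # [('GeneB', -3.5), ('GeneC', 2.8), ('GeneA', 1.2)]
```

The sign of each Z-score is kept in the output, so the direction of change is still visible after sorting.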

Protein network

Proteins from top differentially expressed genes are visualized with connecting lines indicating documented biochemical interactions. Go to the next step by clicking "Proceed" (bottom right of page).



Pseudo-time construction

Cells are ordered in pseudo-time using similarities between their expression profiles.