snakePipes


snakePipes are pipelines built using snakemake and python for the analysis of epigenomic datasets.

The following pipelines are available in snakePipes:

Pipeline            Description
-----------------   -----------
createIndices       Create indices for an organism for further use within snakePipes
DNA-mapping         Basic DNA mapping using bowtie2, filter mapped files, QC and create coverage plots
ChIP-seq            Use the DNA mapping output and run ChIP/Input normalization and peak calling
ATAC-seq            Use the DNA mapping output and detect open chromatin regions for ATAC-seq data
HiC                 Hi-C analysis workflow, from mapping to TAD calling
noncoding-RNA-seq   noncoding-RNA-Seq workflow: from mapping to differential expression of genes and repeat elements using DESeq2
mRNA-seq            RNA-Seq workflow: from mapping to differential expression using DESeq2
scRNA-seq           Single-cell RNA-Seq (CEL-Seq2) workflow: from mapping to differential expression
WGBS                Whole-genome Bisulfite-Seq analysis workflow, from mapping to DMR calling and differential methylation analysis
preprocessing       Merging technical replicates (e.g., across lanes), removing optical duplicates, running FastQC

Quick start

  • Assuming you have python3 with conda, install the latest version of snakePipes with:

conda install mamba -c conda-forge && mamba create -n snakePipes -c mpi-ie -c conda-forge -c bioconda snakePipes

  • You can update snakePipes to the latest version available on conda with:

mamba update -n snakePipes -c mpi-ie -c conda-forge -c bioconda --prune snakePipes

snakePipes is going to move to mamba in the future.

  • Download the genome FASTA and annotations for your organism and build the indexes; see createIndices.

  • Configure snakePipes with paths to organism and cluster configs on your system using snakePipes config. For detailed information, run:

snakePipes config --help

Note

If you have a copy of a shared/defaults.yaml with the necessary paths configured (e.g. from a previous installation), you can pass it to snakePipes config with --oldConfig and --configMode recycle instead of providing all the paths manually again. The config keys have to match for this to work. In the same way, you can pass your external organism yaml folder with --organismsDir or your cluster config with --clusterConfig.

  • Download example fastq files for the human genome here

  • Execute the DNA-mapping pipeline using the example command.sh in the test data directory.
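The configuration step and the note on --oldConfig above can be sketched as a small shell snippet. This is only an illustration under stated assumptions: OLD_CONFIG is a hypothetical location for a defaults.yaml kept from a previous installation, and the STATUS variable is introduced here purely to report what happened. Only the options named in the note (--oldConfig, --configMode recycle) are used.

```shell
#!/usr/bin/env bash
# Sketch: reuse a defaults.yaml from a previous snakePipes installation.
# OLD_CONFIG is a hypothetical path; point it at your own copy.
OLD_CONFIG="${OLD_CONFIG:-$HOME/snakePipes_backup/defaults.yaml}"

if ! command -v snakePipes >/dev/null 2>&1; then
    # The snakePipes conda environment is probably not activated.
    STATUS="snakePipes not found"
elif [ ! -f "$OLD_CONFIG" ]; then
    # Nothing to recycle; fall back to passing the paths manually.
    STATUS="old config not found"
elif snakePipes config --oldConfig "$OLD_CONFIG" --configMode recycle; then
    STATUS="configured"
else
    STATUS="config failed"
fi

echo "$STATUS"
```

If your organism yaml folder or cluster config live outside the installation, --organismsDir and --clusterConfig can be added to the same call; snakePipes config --help lists the exact usage.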

Running your own analysis

For a detailed introduction to setting up snakePipes from scratch, please visit Setting up snakePipes.

For each organism of interest, snakePipes requires fasta files, genome indexes and annotation files. Paths to these files are specified in the organism/<name>.yaml files. After installation, the location of these yaml files can be shown with the following command:

snakePipes info

You can either modify the existing files (adding your own paths) or add a new file there. See Running snakePipes for more details.

snakePipes can be executed either locally or on any snakemake-supported cluster infrastructure. See Running snakePipes for details on setting up the cluster command.

Citation

If you use snakePipes in your analysis, please cite it as follows:

Bhardwaj V, Heyne S, Sikora K, Rabbani L, Rauer M, Kilpert F, Richter AS, Ryan DP, Manke T. snakePipes: facilitating flexible, scalable and integrative epigenomic analysis. Bioinformatics. 2019 May 27. pii: btz436. doi: 10.1093/bioinformatics/btz436. [Epub ahead of print] PubMed PMID: 31134269. https://www.ncbi.nlm.nih.gov/pubmed/31134269


This tool suite is developed by the Bioinformatics Unit at the Max Planck Institute for Immunobiology and Epigenetics, Freiburg.

Help and Support

For questions about snakePipes, please post on Biostars with the tag #snakePipes.

For feature requests or bug reports, please open an issue on our GitHub Repository.
