snakePipes

snakePipes are pipelines built using Snakemake and Python for the analysis of epigenomic datasets.

The pipelines available in snakePipes are listed below:

Pipeline	Description
createIndices	Create indices for an organism for further use within snakePipes
DNAmapping	Basic DNA mapping using bowtie2, filter mapped files, QC and create coverage plots
ChIPseq	Use the DNA mapping output and run ChIP/Input normalization and peak calling
ATACseq	Use the DNA mapping output and detect open chromatin regions for ATACseq data
HiC	HiC analysis workflow, from mapping to TAD calling
makePairs	pairtools workflow, from allele-specific mapping to HiC matrices
ncRNAseq	ncRNAseq workflow: from mapping to differential expression of genes and repeat elements using DESeq2
mRNAseq	RNA-Seq workflow: from mapping to differential expression using DESeq2
scRNAseq	Single-cell RNA-Seq (CEL-Seq2) workflow: from mapping to differential expression
WGBS	Whole-genome Bisulfite-Seq analysis workflow, from mapping to DMR calling and differential methylation analysis
preprocessing	Merging technical replicates (e.g., across lanes), removing optical duplicates, running FastQC

Quick start

Assuming you have python3 with conda, install the latest version of snakePipes with:

conda create -n snakePipes -c conda-forge -c bioconda -c mpi-ie snakePipes

You can update snakePipes to the latest version available on conda with:

conda update -n snakePipes -c conda-forge -c bioconda -c mpi-ie --prune snakePipes
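
Before continuing, it can help to confirm that the snakePipes command is reachable from the current shell. The following is a minimal sketch, assuming the environment is named snakePipes as in the install command above; the STATUS variable is only illustrative and not part of snakePipes.

```shell
# Check whether the snakePipes command is on PATH in this shell.
if command -v snakePipes >/dev/null 2>&1; then
    STATUS="found"
    # Print the locations of the configuration files.
    snakePipes info
else
    STATUS="missing"
    echo "snakePipes not found; activate the environment first with: conda activate snakePipes"
fi
```

If the command is reported as missing, activate the environment and run the check again.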

Download the genome fasta and annotations for your organism and build the indexes; see createIndices.
Configure snakePipes with the paths to the organism and Snakemake configs on your system using snakePipes config. Take care to set the --condaEnvDir parameter, which defaults to /tmp. For detailed information, run:

snakePipes config --help

Note

If you have a copy of a shared/defaults.yaml with the necessary paths configured (e.g. from a previous installation), you can pass it to snakePipes config with --oldConfig and --configMode recycle instead of providing all the paths manually again. The config keys have to match for this to work. In the same way, you can pass your external organism yaml folder with --organismsDir.
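
As a sketch of how the options in this note fit together, the snippet below assembles a recycle-mode call and prints it for review rather than running it. Both paths are hypothetical placeholders, and the variable names are only illustrative; only the flags come from the note above.

```shell
# Hypothetical locations; replace them with the paths on your system.
OLD_CONFIG="$HOME/snakePipes_backup/defaults.yaml"
ORGANISMS_DIR="$HOME/snakePipes_organisms"

# Assemble the call from the flags described in the note.
CONFIG_CMD="snakePipes config --oldConfig $OLD_CONFIG --configMode recycle --organismsDir $ORGANISMS_DIR"

# Print the command so the paths can be checked before it is run.
echo "$CONFIG_CMD"
```

Once the printed paths look right, paste the command into a shell where the snakePipes environment is active.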

Download example fastq files for the human genome here
Execute the DNAmapping pipeline using the example command.sh in the test data directory.

Running your own analysis

For a detailed introduction to setting up snakePipes from scratch, see Setting up snakePipes.

For each organism of interest, snakePipes requires fasta files, genome indexes and annotation files. Paths to these files are specified in the organism/<name>.yaml files. After installation, the location of these files can be displayed with the following command:

snakePipes info

You can either modify the existing files (adding your own paths) or add a new file there. See Running snakePipes for more detail.

snakePipes can be executed either locally or on any Snakemake-supported cluster infrastructure. See Running snakePipes for details on setting up the cluster command.
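
To illustrate what adding a new organism file might look like, the sketch below writes a draft yaml to a scratch directory for review. The organism name myGenome, all paths, the genome size, and the key names are assumptions for illustration; compare them against an existing file in the directory reported by snakePipes info before using them.

```shell
# Write a draft organism file to a scratch directory (nothing is installed).
SKETCH_DIR="$(mktemp -d)"
ORGANISM_YAML="$SKETCH_DIR/myGenome.yaml"

cat > "$ORGANISM_YAML" <<'EOF'
# Key names and values are assumptions; check them against an existing organism file.
genome_size: 1000000  # placeholder value
genome_fasta: "/path/to/myGenome/genome.fa"
genome_index: "/path/to/myGenome/genome.fa.fai"
bowtie2_index: "/path/to/myGenome/BowtieIndex/genome"
genes_gtf: "/path/to/myGenome/annotation/genes.gtf"
genes_bed: "/path/to/myGenome/annotation/genes.bed"
EOF

echo "Draft written to $ORGANISM_YAML"
```

After checking the draft against an existing organism file, copy it into the organism directory so it sits alongside the other yaml files.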

Citation

If you adopt/run snakePipes for your analysis, cite it as follows:

Bhardwaj V, Heyne S, Sikora K, Rabbani L, Rauer M, Kilpert F, Richter AS, Ryan DP, Manke T. snakePipes: facilitating flexible, scalable and integrative epigenomic analysis. Bioinformatics. 2019 May 27. pii: btz436. doi: 10.1093/bioinformatics/btz436. [Epub ahead of print] PubMed PMID: 31134269. https://www.ncbi.nlm.nih.gov/pubmed/31134269

This tool suite is developed by the Bioinformatics Unit at the Max Planck Institute for Immunobiology and Epigenetics, Freiburg.

Help and Support

For questions regarding snakePipes, please post on Biostars with the tag #snakePipes.

For feature requests or bug reports, please open an issue on our GitHub Repository.

code @ github.