Alignment workflow

The alignment workflow starts with the fastq data, performs QC, aligns reads to the genome, removes duplicate reads and creates genomic coverage files.

The base quality for fastq data
All QC results
Sample.genome.bam [.bai]
Aligned bam files
Sample.genome.dedup.bam [.bai]
Aligned bam files with duplicate reads removed
A figure showing % of reads that aligned
The coverage information in bigwig format. This file may be loaded into the UCSC Genome Browser, IGV and similar browsers.
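
The duplicate removal step can be pictured with a small sketch. This is only an illustration of the idea, not the tool the workflow runs (the tool is not named here). It assumes that two reads are duplicates when they share chromosome, position and strand; the `Read` type and its fields are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class Read(NamedTuple):
    """Hypothetical minimal view of an aligned read."""
    name: str
    chrom: str
    pos: int     # alignment start position (assumed 0-based)
    strand: str  # "+" or "-"


def remove_duplicates(reads: Iterable[Read]) -> List[Read]:
    """Keep the first read seen at each (chrom, pos, strand) and drop the rest."""
    seen = set()
    kept = []
    for read in reads:
        key = (read.chrom, read.pos, read.strand)
        if key in seen:
            continue  # an earlier read already occupies this position and strand
        seen.add(key)
        kept.append(read)
    return kept


example_reads = [
    Read("r1", "chr1", 100, "+"),
    Read("r2", "chr1", 100, "+"),  # same position and strand as r1
    Read("r3", "chr1", 100, "-"),  # same position, opposite strand: kept
    Read("r4", "chr2", 100, "+"),
]
deduplicated = remove_duplicates(example_reads)
```

Real deduplication tools usually also consider read pairs and base qualities when choosing which copy to keep; the sketch ignores both.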

Peak calling workflow

The peak calling workflow uses bam files from the alignment workflow to find peaks in a sample alone or differential peaks compared to a control. It annotates the peaks to show overlap with promoters, closest genes, etc., filters out the peaks that overlap with Satellite repeat regions (likely noise) and then makes lists of Promoter, Genebody or Intergenic targets. Finally, it finds known and novel motifs in the peaks.
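
The Satellite filtering step amounts to an interval overlap test. The sketch below is a minimal illustration, assuming that peaks and Satellite regions are both available as (chrom, start, end) tuples with half-open coordinates as in bed files; the function names are hypothetical and the workflow's own filtering tool may behave differently.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]  # (chrom, start, end), half-open as in bed


def overlaps(a: Interval, b: Interval) -> bool:
    """True when both intervals are on the same chromosome and share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def drop_satellite_peaks(peaks: List[Interval],
                         satellites: List[Interval]) -> List[Interval]:
    """Return only the peaks that overlap no Satellite region."""
    return [p for p in peaks if not any(overlaps(p, s) for s in satellites)]


example_peaks = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 100, 200)]
example_satellites = [("chr1", 150, 300)]
filtered_peaks = drop_satellite_peaks(example_peaks, example_satellites)
```

The nested scan is quadratic and only suitable for small inputs; a sorted sweep or an interval tree would be the usual choice for genome-scale files.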

MACS peaks in bed format; these may be loaded in IGV or other browsers.
Peaks with annotation information added, showing overlap with promoters, overlap with gene bodies or the closest genes.
Peaks that remain after removing those that overlap with Satellite regions, as such peaks are likely to be noise.
Lists of target genes
Motifs found in the peaks. There are 2 types of results: a) a de novo search for over-represented n-mers, which are then matched to known motifs (homerMotifs.html), and b) a search of the peaks for known motifs (knownMotifs.html).
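
How a peak ends up in the Promoter, Genebody or Intergenic list can be sketched as follows. This is an assumption-laden illustration, not the workflow's actual annotation logic: the `Gene` type, the `PROMOTER_FLANK` value of 1000 bp on each side of the transcription start site, and the order of the checks are all hypothetical.

```python
from typing import List, NamedTuple, Tuple

PROMOTER_FLANK = 1000  # hypothetical: bp on each side of the TSS


class Gene(NamedTuple):
    """Hypothetical minimal gene record with half-open coordinates."""
    name: str
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


def tss(gene: Gene) -> int:
    """Transcription start site: gene start on "+", gene end on "-"."""
    return gene.start if gene.strand == "+" else gene.end


def classify_peak(chrom: str, start: int, end: int,
                  genes: List[Gene]) -> Tuple[str, str]:
    """Return (category, gene name); Promoter is checked before Genebody."""
    same_chrom = [g for g in genes if g.chrom == chrom]
    for g in same_chrom:
        t = tss(g)
        # Peak overlaps the window [t - flank, t + flank)
        if start < t + PROMOTER_FLANK and t - PROMOTER_FLANK < end:
            return ("Promoter", g.name)
    for g in same_chrom:
        if start < g.end and g.start < end:
            return ("Genebody", g.name)
    if not same_chrom:
        return ("Intergenic", "")
    # Otherwise report the gene whose TSS is closest to the peak midpoint
    mid = (start + end) // 2
    nearest = min(same_chrom, key=lambda g: abs(tss(g) - mid))
    return ("Intergenic", nearest.name)


example_genes = [
    Gene("A", "chr1", 10000, 20000, "+"),
    Gene("B", "chr1", 50000, 60000, "-"),
]
promoter_hit = classify_peak("chr1", 9500, 9700, example_genes)
genebody_hit = classify_peak("chr1", 15000, 15200, example_genes)
intergenic_hit = classify_peak("chr1", 30000, 30200, example_genes)
```

Checking the promoter window first means a peak that touches both a promoter and a gene body is reported as a Promoter target; a different priority would be equally defensible.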