ChIP-Seq macs2 peak analysis

Modified on Sat, 2 Sep, 2023 at 1:50 PM

Peak calling workflow and Report page overview

ChIP-Seq peak calling with macs2 starts from the de-duplicated BAM file produced by our alignment and QC pipeline. The workflow calls peaks, annotates them with genes based on proximity, and reports summary statistics for the samples used in the peak-calling pipeline. The report page also includes an embedded genome browser for viewing all datasets together.

Results

Peaks Distribution

-Summary plots of peak locations relative to genes

Peaks table

-peaks called in your data with their underlying statistics and annotation of nearby genes

(detailed descriptions of the information in this table below)

Correlation and Scatterplots

-pairwise correlation heatmap and scatterplot of samples used in the analysis

Genome Browser

-an interactive, embedded IGV browser session displaying the normalized signal bigwig tracks of all samples used in the analysis

*multiple genomic loci can be visualized at once by entering them separated by spaces: chr8:128,740,266-128,761,729 chr10:3,818,235-3,836,806 chr19:16,433,128-16,438,505
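
When preparing a list of loci for the browser, it can help to check the string before pasting it in. The following is a minimal sketch, assuming loci are written as chrom:start-end with optional thousands separators, as in the example above. The function name parse_loci is hypothetical and not part of the report page.

```python
import re

# chrom:start-end, where start and end may contain commas (e.g. 128,740,266)
LOCUS_PATTERN = re.compile(r"^(?P<chrom>[^:\s]+):(?P<start>[\d,]+)-(?P<end>[\d,]+)$")


def parse_loci(text):
    """Split a space-separated locus string into (chrom, start, end) tuples."""
    loci = []
    for token in text.split():
        match = LOCUS_PATTERN.match(token)
        if match is None:
            raise ValueError(f"not a chrom:start-end locus: {token!r}")
        start = int(match["start"].replace(",", ""))
        end = int(match["end"].replace(",", ""))
        if start > end:
            raise ValueError(f"start is after end in locus: {token!r}")
        loci.append((match["chrom"], start, end))
    return loci


if __name__ == "__main__":
    query = "chr8:128,740,266-128,761,729 chr10:3,818,235-3,836,806"
    for chrom, start, end in parse_loci(query):
        print(chrom, start, end, end - start + 1)
```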

FRiP table

-Fraction of Reads in Peaks (FRiP) quantifies the enrichment of biological signal (reads) in called peaks
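
As an illustration of the FRiP definition above, the value is the number of reads that fall inside called peaks divided by the total number of reads. The sketch below uses toy data and a brute-force overlap check. The function names and the closed-interval convention are assumptions made for illustration, not the pipeline's implementation.

```python
def count_reads_in_peaks(read_positions, peaks):
    """Count reads whose position lies inside any peak (brute force, toy scale).

    read_positions: iterable of (chrom, position)
    peaks: list of (chrom, start, end), treated here as closed intervals
    """
    count = 0
    for chrom, pos in read_positions:
        if any(c == chrom and start <= pos <= end for c, start, end in peaks):
            count += 1
    return count


def frip(reads_in_peaks, total_reads):
    """Fraction of Reads in Peaks = reads in peaks / total reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= reads_in_peaks <= total_reads:
        raise ValueError("reads_in_peaks must be between 0 and total_reads")
    return reads_in_peaks / total_reads


if __name__ == "__main__":
    toy_peaks = [("chr1", 100, 200)]
    toy_reads = [("chr1", 150), ("chr1", 250), ("chr2", 150), ("chr1", 100)]
    in_peaks = count_reads_in_peaks(toy_reads, toy_peaks)
    print(in_peaks, frip(in_peaks, len(toy_reads)))
```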

Enrichment

-Gene set enrichment analysis results of genes associated with peaks

(detailed descriptions of the information in this table below)

Motifs

-motifs found to be enriched in peaks and the transcription factors known to bind to the motif

*further details on output tables below

Peaks table

-peaks called in your data with their underlying statistics and annotation of nearby genes

Chrom, Start, End, Length - genomic coordinates and size of the peak

Pileup - raw count signal at the peak summit

Pval - peak significance (p-value), reported as -10 * log10(p-value)

Qval - FDR-corrected peak significance (q-value), reported as -10 * log10(q-value)

Fold - calculation of signal relative to background

Annotation - location of the peak in relation to the nearest (or overlapping) gene and its features (e.g. TSS, exons)

Accession - gene ID of nearest gene(s)

Symbol - gene symbol/name of nearest gene(s)
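
The Pval and Qval columns are log-scaled, so it can be useful to convert them back to ordinary probabilities when choosing a cutoff. The sketch below assumes the -10 * log10 scaling described above. If your table instead uses plain -log10 values (the convention in the column headers of macs2's own peaks.xls output), pass scale=1. The function names are hypothetical.

```python
import math


def score_to_probability(score, scale=10.0):
    """Invert score = -scale * log10(p) to recover p."""
    return 10 ** (-score / scale)


def probability_to_score(p, scale=10.0):
    """Convert a p-value or q-value to its log-scaled score."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return -scale * math.log10(p)


if __name__ == "__main__":
    # A q-value cutoff of 0.05 expressed on the -10 * log10 scale.
    cutoff = probability_to_score(0.05)
    print(round(cutoff, 2))
    # Keep peaks whose Qval score is at least the cutoff (toy values).
    toy_qvals = [5.0, 13.5, 120.0]
    print([q for q in toy_qvals if q >= cutoff])
```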

Enrichment

-Gene set enrichment analysis results of genes associated with peaks

Database - source of gene sets (or genomic regions/elements) that peaks are tested against

GO Biological Process - (Gene Ontology) gene sets associated with specific biological processes

GO Molecular Function - (Gene Ontology) gene sets associated with specific molecular functions

GO Cellular Component - (Gene Ontology) gene sets associated with specific cellular components

WikiPathways - gene sets associated with various cellular pathways and functions

Chromosome - distinct regions of chromosomes identified based on banding patterns

EMBL-EBI-Pfam - protein family database built from multiple sequence alignments (now part of InterPro)

Gene3D - protein domain database, classification at the level of superfamily

Interactions - gene sets known to interact with a specific protein, available through the NIH

InterPro - database of protein families and their function

miRNA - miRNA targets database from miRDB

MSigDB - gene sets associated with a variety of molecular and cellular functions

PRINTS - protein family "fingerprints" database based on domain conservation (now part of InterPro)

Prosite - database of protein families, domains, and functional sites

SMART - protein domain database

TermID - a unique identifier of this gene set from its database

Term - name of the gene set

Enrichment logP - log-scaled significance of enrichment (p-value) for the gene set

Genes in Term - total number of genes in this gene set

Target Genes in Term - number of genes shared between your dataset and this gene set

Total Target Genes - total number of genes in your dataset

Total Genes in Database - total number of genes in the gene set database

Fraction of Targets in Term - genes shared between your dataset and this gene set as a percent of the total number of genes in your dataset

Targets as Fraction of Genes - genes shared between your dataset and this gene set as a percent of the number of genes in the gene set
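
The count columns above relate to one another arithmetically, which the following sketch works through. The two fraction columns follow directly from the definitions in this table (multiply by 100 for a percent). The hypergeometric upper tail is included only as one common way to score such an overlap. The test and log base actually used for Enrichment logP are not stated here, so treat that part as an assumption. All function names are hypothetical.

```python
from math import comb


def term_fractions(target_genes_in_term, total_target_genes, genes_in_term):
    """Return (Fraction of Targets in Term, Targets as Fraction of Genes)."""
    fraction_of_targets = target_genes_in_term / total_target_genes
    fraction_of_term = target_genes_in_term / genes_in_term
    return fraction_of_targets, fraction_of_term


def hypergeom_upper_tail(target_genes_in_term, total_target_genes,
                         genes_in_term, total_genes_in_database):
    """P(overlap >= observed) when drawing the target genes at random.

    Assumption: a hypergeometric model, which may differ from the report's test.
    """
    k, n = target_genes_in_term, total_target_genes
    K, N = genes_in_term, total_genes_in_database
    numerator = sum(comb(K, i) * comb(N - K, n - i)
                    for i in range(k, min(n, K) + 1))
    return numerator / comb(N, n)


if __name__ == "__main__":
    # Toy numbers: 5 of 50 target genes fall in a 20-gene term, 1000 genes total.
    print(term_fractions(5, 50, 20))
    print(hypergeom_upper_tail(5, 50, 20, 1000))
```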

Output files

Under the "Info" tab, intermediate files produced by the workflow are available for download:

MACS2
macs2/<ANALYSIS_NAME>.<genome>.macs2_peaks.xls

-raw peak calls from macs2
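
Despite the .xls extension, macs2 writes this file as tab-separated text: comment lines starting with #, then a header row, then one row per peak. The sketch below reads such a file with the standard library. It takes column names from whatever header row the file contains. The function name and the toy content are illustrative only.

```python
import csv
import io


def read_macs2_xls(handle):
    """Yield one dict per peak, keyed by the file's own header row."""
    # Drop '#' comment lines and blank lines; the first remaining line is the header.
    data_lines = (line for line in handle
                  if line.strip() and not line.startswith("#"))
    yield from csv.DictReader(data_lines, delimiter="\t")


if __name__ == "__main__":
    # Toy content standing in for a downloaded peaks file.
    toy = io.StringIO(
        "# toy comment line\n"
        "\n"
        "chr\tstart\tend\tlength\tpileup\n"
        "chr1\t100\t300\t201\t42\n"
    )
    for peak in read_macs2_xls(toy):
        print(peak["chr"], int(peak["start"]), int(peak["end"]))
    # For a real file: with open(path, newline="") as fh: peaks = list(read_macs2_xls(fh))
```
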
Annotation
annotate/macs2/<ANALYSIS_NAME>.<genome>.macs2_peaks.annotated.xls

-raw peaks annotated with nearby genes
annotate/macs2/<ANALYSIS_NAME>.<genome>.macs2_peaks.filtered.xls

-remaining peaks after removal of those in blacklisted regions
annotate/macs2/<ANALYSIS_NAME>.<genome>.macs2_peaks.homer_anno_raw.xls

-peaks and their annotations from the HOMER tool
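
The filtered file above contains the peaks left after removing those in blacklisted regions. Conceptually that step is an interval-overlap filter, sketched below on toy data. The tool and overlap rule the pipeline actually uses are not described here, so the closed-interval, any-overlap rule and the function names are assumptions.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if closed intervals [a_start, a_end] and [b_start, b_end] share a base."""
    return a_start <= b_end and b_start <= a_end


def remove_blacklisted(peaks, blacklist):
    """Keep peaks that do not overlap any blacklisted region on the same chromosome."""
    kept = []
    for chrom, start, end in peaks:
        hit = any(chrom == b_chrom and overlaps(start, end, b_start, b_end)
                  for b_chrom, b_start, b_end in blacklist)
        if not hit:
            kept.append((chrom, start, end))
    return kept


if __name__ == "__main__":
    toy_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
    toy_blacklist = [("chr1", 150, 300)]
    print(remove_blacklisted(toy_peaks, toy_blacklist))
```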

ChipSeeker Docker
chip_seeker/chip_seeker/Rplots.pdf

-summary pie chart of peak annotations displaying their proximity to specific gene elements

chip_seeker/chip_seeker/covplot.png

-display of peaks along each chromosome

chip_seeker/chip_seeker/plotannopie.png

-summary pie chart of peak annotations displaying their proximity to specific gene elements

chip_seeker/chip_seeker/upsetplot.png

-upset plot of peak annotations displaying their proximity to specific gene elements

chip_seeker/chip_seeker/upsetvennpie.png

-plot combining both the pie chart and upset plot

Enrichment
enrichment/macs2/<ANALYSIS_NAME>.<genome>.macs2_peaks.homer_anno_raw_go_summary.xls

-gene ontology enrichment results from the HOMER tool
Homer
homer_view/all_motif_data.json

-json file used to display motif results in the report page
homer_view/individual_motif_data.json

-json file used to display motif results in the report page
homer_view/macs2/<ANALYSIS_NAME>.<genome>.macs2_peaks.filtered.motifs.zip

-compressed folder of full HOMER tool results
BigwigSummary
multiBigwigSummary/readCounts.npz

-normalized counts in peaks for all samples used in the analysis in npz format
multiBigwigSummary/readCounts.tab

-normalized counts in peaks for all samples used in the analysis in tab-separated format
PlotCorrelation
plotCorrelation_heatmap/outFileCorMatrix.tab

-correlation matrix of samples used in analysis
plotCorrelation_heatmap/heatmap.plotCorrelation.png

-correlation heatmap

PlotCorrelation
plotCorrelation_scatterplot/scatterplot.plotCorrelation.png

-pairwise correlation plots
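
To explore sample similarity outside the report, the readCounts.tab file listed under BigwigSummary can be read column by column and compared between samples. The sketch below assumes a tab-separated layout: a header row (possibly prefixed with # and with quoted names), three coordinate columns, then one column per sample. It computes a Pearson correlation. Whether the report's heatmap uses Pearson or Spearman is not stated here, and the function names are hypothetical.

```python
import math


def read_counts_table(lines):
    """Return {sample_name: [values]} from tab-separated lines with a header row."""
    samples = None
    columns = {}
    for line in lines:
        if not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if samples is None:
            # Header: strip a leading '#' and surrounding quotes from each name.
            names = [f.lstrip("#").strip("'\"") for f in fields]
            samples = names[3:]  # first three columns are chrom, start, end
            columns = {name: [] for name in samples}
            continue
        for name, value in zip(samples, fields[3:]):
            columns[name].append(float(value))
    return columns


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    toy = [
        "#'chr'\t'start'\t'end'\t'a.bw'\t'b.bw'\n",
        "chr1\t0\t100\t1.0\t2.0\n",
        "chr1\t100\t200\t2.0\t4.5\n",
        "chr1\t200\t300\t3.0\t5.5\n",
    ]
    counts = read_counts_table(toy)
    print(pearson(counts["a.bw"], counts["b.bw"]))
```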