Alignment workflow and Report page overview
For each individual sample, the alignment workflow starts from FASTQ data, performs QC, aligns reads to the genome using Bowtie2, removes duplicate reads, and creates genomic coverage files. Summary statistics and plots allow a quick assessment of data quality, and an embedded genome browser provides a direct look at the data.
Results
Under the "Report" tab, a series of interactive and downloadable plots allows for in-depth data exploration:
Quality Scores
-average quality scores at each base position of the sequencing reads, both raw and after trimming
Number of Reads
-a Sankey plot summarizing the trimming and alignment steps
Coverage
-a summary plot of read coverage throughout the genome
Insert Size
-distribution of insert sizes for paired-end data
Gene Body and Transcription Start Site
-heatmaps and average profiles of normalized signal at transcription start sites (TSSs) and gene bodies
Genome Browser
-an interactive, embedded IGV browser session displaying the normalized signal bigwig track
*multiple genomic loci can be visualized at once by entering them separated by spaces:
chr8:128,740,266-128,761,729 chr10:3,818,235-3,836,806 chr19:16,433,128-16,438,505
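If you want to check a list of loci before pasting it into the browser, the space-separated format shown above can be parsed with a few lines of Python. This is only an illustrative sketch: the function name parse_loci and the pattern it uses are assumptions made for this example and are not part of the workflow or of IGV.

```python
import re
from typing import List, Tuple

# Assumed locus format: <chromosome>:<start>-<end>, where the coordinates
# may contain thousands separators (commas), as in the example above.
LOCUS_PATTERN = re.compile(r"^(?P<chrom>[^:\s]+):(?P<start>[\d,]+)-(?P<end>[\d,]+)$")


def parse_loci(text: str) -> List[Tuple[str, int, int]]:
    """Split a space-separated string of loci into (chrom, start, end) tuples.

    Hypothetical helper for illustration; raises ValueError on bad input.
    """
    loci = []
    for token in text.split():
        match = LOCUS_PATTERN.match(token)
        if match is None:
            raise ValueError(f"not a valid locus: {token!r}")
        # Remove the thousands separators before converting to integers.
        start = int(match.group("start").replace(",", ""))
        end = int(match.group("end").replace(",", ""))
        if start > end:
            raise ValueError(f"start is greater than end in locus: {token!r}")
        loci.append((match.group("chrom"), start, end))
    return loci


if __name__ == "__main__":
    example = (
        "chr8:128,740,266-128,761,729 "
        "chr10:3,818,235-3,836,806 "
        "chr19:16,433,128-16,438,505"
    )
    for chrom, start, end in parse_loci(example):
        print(chrom, start, end, end - start + 1)
```

Each tuple holds the chromosome name and the integer start and end coordinates; the last printed column is the length of the region in base pairs.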
Output files
Under the "Info" tab, intermediate files produced by the workflow are available for download:
QC, Trim
trim/<SAMPLE_NAME>.trim.report.html (can be opened in your web browser or downloaded)
-a detailed report of raw sequencing quality, base content, estimated PCR duplication level, insert size distribution, adapter content, and k-mer overrepresentation
Align (Bowtie2)
bowtie/<SAMPLE_NAME>.<genome>.bam.bai
-index file of the raw alignment bam file
bowtie/<SAMPLE_NAME>.<genome>.bam
-compressed BAM file of the raw alignments
Summary
summary/<SAMPLE_NAME>.<genome>.alignment-summary.png
-a summary plot of reads that align to a single place in the genome ("unique"), to multiple places, or not at all
Remove duplicates
dedup/<SAMPLE_NAME>.<genome>.dedup.bam.bai
-index file of the de-duplicated alignment bam file
dedup/duplicate_reads.stats
-a text file containing the statistics on PCR duplication level
dedup/<SAMPLE_NAME>.<genome>.dedup.bam
-compressed BAM file of the alignments after removing PCR duplicates ("de-duplicated")
Coverage
coverage/<SAMPLE_NAME>.<genome>.dedup.coverage-summary.xls
-summary statistics of genome-wide coverage
Insert size
insert_size/<SAMPLE_NAME>.<genome>.dedup.insert-size-histogram.png
-the unmodified insert size distribution plot produced by Picard tools
BigWig
bam_to_bigwig/<SAMPLE_NAME>.<genome>.dedup.bigwig
-normalized, compressed signal track in bigwig format
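
The file names listed above follow a regular pattern built from the sample name and the genome. As a rough sketch, the Python snippet below fills in that pattern so you can see which files to expect for a given sample. The function name expected_outputs, the dictionary keys, and the example values "sampleA" and "hg38" are hypothetical and chosen only for illustration; the path templates themselves are copied from the list above.

```python
from typing import Dict


def expected_outputs(sample: str, genome: str) -> Dict[str, str]:
    """Return the relative paths of the workflow outputs listed in this article.

    Hypothetical helper: it only fills the <SAMPLE_NAME> and <genome>
    placeholders and does not check that the files exist.
    """
    prefix = f"{sample}.{genome}"
    return {
        "trim_report": f"trim/{sample}.trim.report.html",
        "raw_bam": f"bowtie/{prefix}.bam",
        "raw_bam_index": f"bowtie/{prefix}.bam.bai",
        "alignment_summary": f"summary/{prefix}.alignment-summary.png",
        "dedup_bam": f"dedup/{prefix}.dedup.bam",
        "dedup_bam_index": f"dedup/{prefix}.dedup.bam.bai",
        # This file name contains no sample or genome placeholder.
        "duplicate_stats": "dedup/duplicate_reads.stats",
        "coverage_summary": f"coverage/{prefix}.dedup.coverage-summary.xls",
        "insert_size_plot": f"insert_size/{prefix}.dedup.insert-size-histogram.png",
        "bigwig": f"bam_to_bigwig/{prefix}.dedup.bigwig",
    }


if __name__ == "__main__":
    # "sampleA" and "hg38" are made-up example values.
    for label, path in expected_outputs("sampleA", "hg38").items():
        print(f"{label}: {path}")
```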