RNA-Seq Alignment and QC

Modified on Sat, 2 Sep, 2023 at 5:46 AM

Alignment workflow and Report page overview

For each sample, the alignment workflow starts from FASTQ data, performs QC, aligns reads to the genome using STAR, produces a raw and normalized gene count table, and creates genomic coverage files. Summary statistics and plots are displayed for a quick assessment of data quality, and an embedded genome browser provides a direct look at the data.

Results

Under the "Report" tab, a series of interactive, downloadable plots allows for in-depth data exploration:

Quality Scores

-average quality scores at each base position of the sequencing reads, before and after trimming

Number of reads

-a Sankey plot summarizing the trimming and alignment steps

Metrics

-a summary plot of how the reads have aligned relative to gene annotations
("mRNA" represents reads mapping to exons)

Read counts

-raw and normalized gene and transcript counts and summary plots of their distributions

Genome browser

-an interactive, embedded IGV browser session displaying the normalized signal bigwig track

*multiple genomic loci can be visualized at once by entering them separated by spaces:

chr15:61,977,809-61,997,892 chr13:5,848,131-5,883,750 chr8:72,315,172-72,325,543
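To check a locus string before pasting it into the browser, the format shown above can be parsed with a few lines of Python. This is only a minimal sketch: the function name parse_loci and the regular expression are illustrative assumptions, not part of the pipeline or of IGV, and the pattern covers only the chr:start-end form used in the example.

```python
import re

# Matches one locus such as "chr15:61,977,809-61,997,892".
# Assumption: coordinates are digits with optional thousands separators.
LOCUS_RE = re.compile(r"^(?P<chrom>[^:\s]+):(?P<start>[\d,]+)-(?P<end>[\d,]+)$")


def parse_loci(text):
    """Split a space-separated locus string into (chrom, start, end) tuples."""
    loci = []
    for token in text.split():
        match = LOCUS_RE.match(token)
        if match is None:
            raise ValueError(f"not a valid locus: {token!r}")
        # Drop the thousands separators before converting to integers.
        start = int(match.group("start").replace(",", ""))
        end = int(match.group("end").replace(",", ""))
        if start > end:
            raise ValueError(f"start is greater than end in {token!r}")
        loci.append((match.group("chrom"), start, end))
    return loci


example = "chr15:61,977,809-61,997,892 chr13:5,848,131-5,883,750 chr8:72,315,172-72,325,543"
for chrom, start, end in parse_loci(example):
    print(chrom, start, end)
```

A malformed entry (for example, a missing end coordinate) raises a ValueError naming the offending token, which makes typos easier to spot than a browser that silently ignores the locus.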

Output Files

Under the "Info" tab, intermediate files produced by the pipeline are available for viewing or download:

QC, Trim
trim/<SAMPLE_NAME>.trim.report.html

-a detailed report of raw sequencing quality, base content, estimates of PCR duplication level, insert size distribution, adapter content, and k-mer overrepresentation

Align (STAR)
star/<SAMPLE_NAME>.<genome>.bam.bai

-index file of the raw alignment BAM file

star/<SAMPLE_NAME>.<genome>.bam

-compressed BAM file of the raw alignments

BigWig
bigwig/<SAMPLE_NAME>.<genome>.norm-RPM.bigwig

-normalized, compressed signal track in bigwig format

Expr count
featurecounts/<SAMPLE_NAME>.<genome>.counts_gene.txt

-text file containing raw and normalized counts for each gene

featurecounts/<SAMPLE_NAME>.<genome>.counts_transcript.txt

-text file containing raw and normalized counts for each transcript
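
When downloading results for many samples, it can help to generate the expected file names from the patterns listed above. The sketch below does only that; the function name expected_outputs, the dictionary keys, and the sample and genome values in the usage lines are hypothetical placeholders, not names used by the pipeline.

```python
def expected_outputs(sample_name, genome):
    """Return the relative output paths for one sample.

    The path patterns are copied from the file list above; the
    dictionary keys are illustrative labels only.
    """
    prefix = f"{sample_name}.{genome}"
    return {
        "trim_report": f"trim/{sample_name}.trim.report.html",
        "bam": f"star/{prefix}.bam",
        "bam_index": f"star/{prefix}.bam.bai",
        "bigwig": f"bigwig/{prefix}.norm-RPM.bigwig",
        "gene_counts": f"featurecounts/{prefix}.counts_gene.txt",
        "transcript_counts": f"featurecounts/{prefix}.counts_transcript.txt",
    }


# Hypothetical sample and genome names, for illustration only.
for label, path in expected_outputs("sampleA", "mm10").items():
    print(f"{label}: {path}")
```

Note that the trim report path contains only the sample name, while all other files also include the genome name.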