Data normalization for RNA-Seq


Raw read counts are normalized with the DESeq2 package, which uses the median-of-ratios method. For each gene, DESeq2 computes the geometric mean of its counts across all samples, and the gene's count in each sample is then divided by this mean. The median of these ratios over all genes in a sample is the size factor for that sample. Dividing a sample's raw counts by its size factor gives the normalized counts.
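
The median-of-ratios calculation described above can be sketched in a few lines of NumPy. This is only an illustration of the arithmetic, not DESeq2's own code (DESeq2 is an R package); the function names size_factors and normalize_counts are hypothetical, and the sketch assumes a genes x samples matrix of raw counts. Genes with a zero count in any sample have a geometric mean of zero, so they are left out of the size factor estimate.

```python
import numpy as np


def size_factors(counts):
    """Median-of-ratios size factors for a genes x samples count matrix.

    Illustrative sketch only; the function name is hypothetical.
    """
    counts = np.asarray(counts, dtype=float)
    # A gene with a zero count in any sample has a geometric mean of zero,
    # so only genes with positive counts in every sample are used.
    keep = (counts > 0).all(axis=1)
    if not keep.any():
        raise ValueError("no gene has a positive count in every sample")
    log_counts = np.log(counts[keep])
    # Mean of the logs = log of the geometric mean, one value per gene.
    log_geo_mean = log_counts.mean(axis=1, keepdims=True)
    # Log of (count / geometric mean) for every gene in every sample.
    log_ratios = log_counts - log_geo_mean
    # Median over genes within each sample, back on the original scale.
    return np.exp(np.median(log_ratios, axis=0))


def normalize_counts(counts):
    """Divide each sample (column) by its size factor."""
    counts = np.asarray(counts, dtype=float)
    return counts / size_factors(counts)


# Toy example: sample 2 was sequenced twice as deeply as sample 1.
raw = np.array([[10, 20],
                [100, 200],
                [5, 10]])
factors = size_factors(raw)        # approximately [0.707, 1.414]
normalized = normalize_counts(raw)  # the two columns become equal
```

In the toy example the size factors differ by a factor of two, so after normalization the two samples have identical counts for every gene.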


For the heatmap, the normalized read counts are converted to Z-scores across samples for each gene. Z-scores are computed gene by gene (row by row): the gene's mean across samples is subtracted from each value, and the result is divided by the gene's standard deviation across samples. The resulting Z-scores are then used to plot the heatmap.
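
The row-wise Z-score scaling can be sketched as follows. This is a minimal illustration, not the exact code used to produce the heatmap; the function name row_zscore is hypothetical. The text does not say whether the sample or the population standard deviation is used, so the sample standard deviation (ddof=1) is an assumption here. Genes with constant expression have a standard deviation of zero, and this sketch sets their Z-scores to 0 rather than dividing by zero.

```python
import numpy as np


def row_zscore(normalized, ddof=1):
    """Z-score each gene (row) of a genes x samples matrix across samples.

    Illustrative sketch only; ddof=1 (sample standard deviation) is an
    assumption, since the source does not state which estimator is used.
    """
    x = np.asarray(normalized, dtype=float)
    mean = x.mean(axis=1, keepdims=True)
    sd = x.std(axis=1, ddof=ddof, keepdims=True)
    # Rows with zero (or undefined) standard deviation keep a Z-score of 0.
    z = np.zeros_like(x)
    np.divide(x - mean, sd, out=z, where=sd > 0)
    return z


# Toy example: one varying gene and one constant gene.
example = np.array([[1.0, 2.0, 3.0],
                    [5.0, 5.0, 5.0]])
z = row_zscore(example)
# First row becomes [-1, 0, 1]; the constant second row stays at 0.
```

After this scaling every gene with varying expression has mean 0 and standard deviation 1 across samples, which puts all rows of the heatmap on a common color scale.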

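In the heatmap, dark red marks samples in which a gene is up-regulated (expressed above its mean across samples), and blue marks samples in which it is down-regulated (expressed below its mean). Since the rows (genes) are Z-score scaled, the colors represent a single gene's varying expression across the samples and do not compare absolute expression levels between genes.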