Data normalization for RNA-Seq & Heatmap Generation

Modified on Sat, 10 Jul, 2021 at 8:48 AM

Data normalization for RNA-Seq


The raw read counts are normalized using the DESeq2 package. DESeq2 performs an internal normalization (the median-of-ratios method): a geometric mean is calculated for each gene across all samples, and the count for that gene in each sample is divided by this mean. The median of these ratios within a sample is the size factor for that sample, and each raw count is divided by its sample's size factor to obtain the normalized count.
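The size-factor calculation described above can be sketched in a few lines of Python. This is a simplified illustration, not DESeq2's own code: the function names `size_factors` and `normalize` and the example counts are hypothetical, and the sketch assumes that genes with a zero count in any sample are skipped, since their geometric mean is zero.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of rows (one per gene), each a list of raw counts per sample.
    Returns one size factor per sample.
    """
    n_samples = len(counts[0])
    ratios = [[] for _ in range(n_samples)]
    for row in counts:
        # Assumption: skip genes with a zero count in any sample,
        # because their geometric mean would be zero.
        if any(c <= 0 for c in row):
            continue
        # Geometric mean of the gene across all samples, computed via logs.
        geo_mean = math.exp(sum(math.log(c) for c in row) / len(row))
        for j, c in enumerate(row):
            ratios[j].append(c / geo_mean)
    # The size factor of a sample is the median of its ratios.
    # (Raises StatisticsError if every gene was skipped.)
    return [median(r) for r in ratios]

def normalize(counts, factors):
    """Divide each raw count by the size factor of its sample."""
    return [[c / f for c, f in zip(row, factors)] for row in counts]

# Hypothetical example: sample 2 was sequenced twice as deeply as sample 1.
raw = [[10, 20], [100, 200], [5, 10]]
factors = size_factors(raw)
normalized = normalize(raw, factors)
```

In this toy example the second size factor is twice the first, so after normalization each gene has the same value in both samples.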


For the heatmap, a Z-score normalization is performed on the normalized read counts across samples for each gene. Z-scores are computed on a gene-by-gene (row-by-row) basis by subtracting the gene's mean across samples from each value and then dividing by the gene's standard deviation. The computed Z-scores are then used to plot the heatmap.
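As a minimal sketch of the row-wise Z-score step, the following Python function scales one gene's normalized counts. The function name `row_zscores` and the example values are hypothetical. The article does not say which standard deviation is used; this sketch assumes the sample standard deviation (n - 1 denominator) and returns zeros for a gene with no variation.

```python
from statistics import mean, stdev

def row_zscores(row):
    """Z-score one gene's normalized counts across samples.

    row: list of normalized counts for a single gene (at least two samples).
    """
    mu = mean(row)
    # Assumption: sample standard deviation (n - 1 denominator).
    sd = stdev(row)
    if sd == 0:
        # A gene with identical values in every sample has no variation to scale.
        return [0.0] * len(row)
    return [(x - mu) / sd for x in row]

# Hypothetical normalized counts: genes in rows, samples in columns.
normalized_counts = [
    [1.0, 2.0, 3.0],
    [100.0, 200.0, 300.0],
    [7.0, 7.0, 7.0],
]
z_matrix = [row_zscores(row) for row in normalized_counts]
```

The first two genes differ 100-fold in absolute expression but produce identical Z-scores, which is why the scaled values are comparable across samples within a gene and not between genes.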


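Dark red cells indicate that a gene is up-regulated in that sample, and blue cells that it is down-regulated, relative to the gene's mean across samples. Because the rows (genes) are Z-score scaled, the colors represent a single gene’s varying expression across the samples and should not be used to compare expression levels between different genes.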
