What quality control metrics are important in ATAC-seq data?

Modified on Thu, 3 Dec, 2020 at 10:14 PM

ATAC-seq, like other next-generation sequencing (NGS) assays, involves numerous steps, including library preparation, base calling and read mapping. Each step has to be evaluated to ensure the quality of the results.


Library complexity

Library complexity indicates the number of unique reads in the library being sequenced, and it correlates positively with the number of cells used. In ATAC-seq, too few cells can cause over-transposition (or over-digestion), leading to an abundance of small fragments, while too many cells cause under-transposition, leading to an abundance of very large fragments.
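As a rough illustration of what "unique reads" means in practice, the sketch below counts distinct fragments in a small, made-up list of mapped fragments and reports the unique fraction. The function name `unique_fraction`, the tuple layout (chromosome, start, end, strand) and the toy data are assumptions for this example; dedicated duplicate-marking tools apply more careful rules.

```python
from collections import Counter

def unique_fraction(fragments):
    """Return (unique, total, fraction) for an iterable of fragments.

    Each fragment is a (chrom, start, end, strand) tuple. Fragments with
    identical coordinates and strand are treated as duplicates of each other,
    which is a simplification used only for this sketch.
    """
    counts = Counter(fragments)
    total = sum(counts.values())   # all fragments, duplicates included
    unique = len(counts)           # distinct coordinate/strand combinations
    fraction = unique / total if total else 0.0
    return unique, total, fraction

# Toy input (hypothetical coordinates, not real data)
fragments = [
    ("chr1", 100, 250, "+"),
    ("chr1", 100, 250, "+"),  # exact duplicate of the first fragment
    ("chr1", 300, 480, "-"),
    ("chr2", 50, 120, "+"),
]

unique, total, fraction = unique_fraction(fragments)
print(f"{unique} unique of {total} fragments ({fraction:.2f})")
```

A fraction close to 1 suggests that most sequenced fragments are distinct, while a low fraction suggests that the same molecules were sequenced repeatedly.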


Base quality

The quality of a base call is expressed as a PHRED score, which is calculated from the estimated probability that the nucleotide was called incorrectly. The higher the PHRED score, the higher the quality and the lower the probability of an error. For example, a score of 20 corresponds to a 1 in 100 chance of error, and a score of 30 to a 1 in 1,000 chance.
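The relationship between a PHRED score Q and the error probability P is Q = -10 * log10(P). The sketch below converts in both directions and decodes a FASTQ quality string. It assumes the common Phred+33 ASCII encoding, which should be checked against the actual data; the function names are chosen for this example only.

```python
import math

def phred_to_error_prob(q):
    """Convert a PHRED score to the probability that the base call is wrong."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p):
    """Convert an error probability (0 < p <= 1) to a PHRED score."""
    return -10 * math.log10(p)

def decode_quality_string(qual, offset=33):
    """Decode a FASTQ quality string into a list of PHRED scores.

    The offset of 33 (Phred+33) is an assumption; older data may use 64.
    """
    return [ord(char) - offset for char in qual]

# Toy quality string: 'I' -> 40, '5' -> 20, '+' -> 10 under Phred+33
scores = decode_quality_string("II5+")
for q in scores:
    print(f"Q{q}: error probability {phred_to_error_prob(q):.4f}")
```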


Alignment

Once base calls are compiled into reads, the reads must be aligned accurately to the genome. Two metrics give a good indication of the quality of the sequencing reaction. The alignment rate is the percentage of reads that map to the genome, with optimal values above 90%. The quality of each alignment, that is, how confidently a read is placed at a specific region of the genome, is reported as a MAPQ score. Reads that match a single region uniquely receive the highest scores, while reads that match many regions equally well receive the lowest score (0). The maximum value depends on the aligner: for example, Bowtie 2 reports values up to 42 and BWA-MEM up to 60, whereas STAR assigns 255 to uniquely mapped reads. In the SAM specification itself, a value of 255 means that the mapping quality is not available.
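Both metrics can be read from the FLAG and MAPQ columns of a SAM file. The sketch below is a minimal, dependency-free illustration on a few made-up SAM records. The function name `alignment_summary` and the MAPQ cutoff of 30 are assumptions for this example, not recommended settings. Secondary and supplementary records are skipped so that each read is counted once; for paired-end data each mate is counted separately.

```python
def alignment_summary(sam_lines, min_mapq=30):
    """Summarise alignment rate and MAPQ from SAM-formatted text lines.

    min_mapq is a hypothetical cutoff chosen for illustration only.
    """
    total = mapped = high_quality = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])   # column 2: bitwise FLAG
        mapq = int(fields[4])   # column 5: MAPQ
        if flag & 0x100 or flag & 0x800:
            continue  # secondary or supplementary alignment
        total += 1
        if flag & 0x4:
            continue  # read is unmapped
        mapped += 1
        if mapq >= min_mapq:
            high_quality += 1
    rate = mapped / total if total else 0.0
    return {
        "total": total,
        "mapped": mapped,
        "high_quality": high_quality,
        "alignment_rate": rate,
    }

# Toy SAM records (hypothetical reads, not real data)
sam_lines = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t42\t50M\t*\t0\t0\t*\t*",
    "r2\t16\tchr1\t200\t0\t50M\t*\t0\t0\t*\t*",    # mapped, but MAPQ 0
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",            # unmapped
    "r4\t0\tchr2\t300\t30\t50M\t*\t0\t0\t*\t*",
    "r4\t256\tchr3\t900\t0\t50M\t*\t0\t0\t*\t*",   # secondary, skipped
]

summary = alignment_summary(sam_lines)
print(f"Alignment rate: {summary['alignment_rate']:.0%}")
print(f"Mapped reads with MAPQ >= 30: {summary['high_quality']} of {summary['mapped']}")
```

For real BAM files, established tools such as samtools report the same kind of summary and are a better choice than hand-written parsing.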

