ATAC-seq, like other next-generation sequencing (NGS) assays, involves numerous steps, including library preparation, base calling and read mapping. Each step has to be evaluated to ensure the quality of the results.
Library complexity indicates the number of unique reads in the library being sequenced, and it correlates positively with the number of cells used. In ATAC-seq, too few cells can cause over-transposition (or over-digestion), leading to an abundance of small fragments, whereas too many cells cause under-transposition, which leads to an abundance of very large fragments.
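As a rough illustration of what "number of unique reads" means in practice, the sketch below counts distinct fragments in a list of aligned fragments. It assumes, purely for illustration, that two fragments with identical chromosome, start and end coordinates are duplicates of the same original molecule; the function name and the tuple layout are hypothetical and are not taken from any specific ATAC-seq pipeline.

```python
def library_complexity(fragments):
    """Summarize library complexity from aligned fragments.

    `fragments` is an iterable of (chrom, start, end) tuples.
    Assumption: identical coordinates mean the same original molecule.
    """
    fragments = list(fragments)
    if not fragments:
        raise ValueError("no fragments supplied")
    unique = len(set(fragments))  # distinct coordinate triples
    return {
        "total": len(fragments),
        "unique": unique,
        "unique_fraction": unique / len(fragments),
    }


# Toy input: four fragments, two of which share the same coordinates.
toy_fragments = [
    ("chr1", 100, 250),
    ("chr1", 100, 250),
    ("chr1", 300, 480),
    ("chr2", 10, 90),
]
summary = library_complexity(toy_fragments)
```

A low unique fraction in such a summary would suggest that many reads are copies of the same few molecules, which is what a low-complexity library looks like.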
The quality of a base call is defined by a Phred score (Q), which is calculated from the probability P that the nucleotide was called incorrectly: Q = -10 log10(P). The higher the Phred score, the higher the quality and the lower the probability of an error; for example, Q20 corresponds to a 1 in 100 chance of error and Q30 to 1 in 1,000.
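The relationship between a Phred score and the error probability can be made concrete with a minimal sketch. The conversion follows the standard definition Q = -10 log10(P); the decoding helper additionally assumes the Phred+33 ASCII encoding used in most current FASTQ files, which is an assumption about the input rather than something stated above. All function names are illustrative.

```python
import math


def phred_to_error_prob(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred score for an error probability p (0 < p <= 1)."""
    if not 0 < p <= 1:
        raise ValueError("p must be in the interval (0, 1]")
    return -10 * math.log10(p)


def decode_phred33(quality_string):
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(ch) - 33 for ch in quality_string]


# 'I' encodes Q40, '5' encodes Q20 and '!' encodes Q0 under Phred+33.
scores = decode_phred33("I5!")
error_probs = [phred_to_error_prob(q) for q in scores]
```

Under these assumptions, a read whose quality string consists mostly of high characters such as 'I' has a very low per-base error probability, while characters near '!' mark calls that are essentially uninformative.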
Once base calls are compiled into reads, it is paramount to align them accurately onto the genome. Two metrics are commonly used to assess this step. The alignment rate indicates the percentage of reads that map onto the genome, with optimal values being above 90%. The quality of the alignment, or how confidently a read is assigned to a specific region of the genome, is reported as a MAPQ score. High MAPQ values go to reads that match a single region uniquely, while the lowest (0) goes to those that match many regions equally well. The maximum value depends on the aligner (for example, 42 for Bowtie2 and 60 for BWA); the SAM specification reserves 255 for "mapping quality not available", although some aligners, such as STAR, use it for uniquely mapped reads.
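Both alignment metrics can be read directly from SAM records, as the following sketch suggests. It relies on the standard SAM column layout (FLAG in column 2, MAPQ in column 5) and the standard flag bits for unmapped (0x4), secondary (0x100) and supplementary (0x800) records. The function name and the MAPQ cutoff of 30 are hypothetical choices for illustration; a suitable cutoff depends on the aligner.

```python
def alignment_summary(sam_lines, min_mapq=30):
    """Compute alignment rate and high-MAPQ rate from SAM text lines.

    Only primary records are counted, so each read contributes once.
    `min_mapq` is an illustrative cutoff, not a recommended value.
    """
    total = mapped = confident = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        mapq = int(fields[4])
        if flag & 0x100 or flag & 0x800:
            continue  # skip secondary and supplementary records
        total += 1
        if flag & 0x4:
            continue  # read is unmapped
        mapped += 1
        if mapq >= min_mapq:
            confident += 1
    if total == 0:
        raise ValueError("no primary alignment records found")
    return {
        "total": total,
        "alignment_rate": 100.0 * mapped / total,
        "high_mapq_rate": 100.0 * confident / total,
    }
```

In a real workflow the records would more likely come from a BAM file read with a dedicated library; plain text lines are used here only to keep the sketch self-contained.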