Monday, November 8, 2010

Pileup format

samtools

chr1 808922 G A 21 116 53 9 aAaaaaaAA 9=;:@?9:,


chr1 ---> name of this sequence is chr1
808922 ---> coordinate is 808922 (1-based)
G ---> reference base is 'G'
A ---> consensus base is 'A'
21 ---> consensus quality is 21
116 ---> SNP quality is 116
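53 ---> mapping quality of the reads covering the site is 53 (the samtools manual describes this column as the root mean square (RMS) mapping quality)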
9 ---> the number of reads covering the site is 9
aAaaaaaAA ---> read bases are 'aAaaaaaAA'
9=;:@?9:, ---> base qualities are '9=;:@?9:,'
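
The base-quality string stores one character per read base, and each character encodes a Phred score as its ASCII code minus 33. As a minimal sketch, the shell snippet below pulls the last column out of the example record and decodes it. It assumes the record is tab-separated, as in a real pileup file; the variable names are only illustrative.

```shell
# The example record from above, written with tab separators.
line=$(printf 'chr1\t808922\tG\tA\t21\t116\t53\t9\taAaaaaaAA\t9=;:@?9:,')

# Column 10 holds one quality character per read base.
quals=$(printf '%s\n' "$line" | awk -F'\t' '{print $10}')

# Decode each character as (ASCII code - 33).
decoded=""
rest=$quals
while [ -n "$rest" ]; do
    ch=${rest%"${rest#?}"}        # first character of the remaining string
    rest=${rest#?}                # drop that character
    code=$(printf '%d' "'$ch")    # numeric ASCII code of the character
    decoded="$decoded$((code - 33)) "
done
decoded=${decoded% }
echo "$decoded"    # prints: 24 28 26 25 31 30 24 25 11
```

The nine decoded scores line up with the nine read bases in 'aAaaaaaAA'.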


Consensus quality (21) is the Phred-scaled probability that the consensus base is wrong. SNP quality (116) is the Phred-scaled probability that the consensus is identical to the reference. In both cases a higher value means a more confident call.
For SNP calling, SNP quality is the more important of the two.
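
To make "Phred-scaled" concrete: a quality Q corresponds to a probability P = 10^(-Q/10). The following sketch converts the two qualities from the example record back to probabilities with awk; `phred_to_prob` is a helper name made up for this illustration.

```shell
# Convert a Phred-scaled quality Q to a probability P = 10^(-Q/10).
phred_to_prob() {
    awk -v q="$1" 'BEGIN { printf "%.2e\n", exp(-(q / 10) * log(10)) }'
}

cons_err=$(phred_to_prob 21)     # probability that the consensus is wrong
snp_ref=$(phred_to_prob 116)     # probability that the consensus equals the reference

echo "consensus quality 21 -> $cons_err"    # 7.94e-03
echo "SNP quality 116      -> $snp_ref"     # 2.51e-12
```

So at this site there is a little under a 1% chance that the consensus call 'A' is wrong, and a negligible chance that the site actually matches the reference 'G'.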

For example, to further filter the SNP calls and keep only sites whose SNP quality (column 6) is at least 20:
samtools.pl varFilter raw.pileup | awk '$6>=20' > final.pileup
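
To see what the awk stage alone does, here is a small sketch that runs the same `$6>=20` test on two records without calling samtools. The first record is the example from this post; the second is made up for illustration and has an SNP quality of 15.

```shell
# Feed two pileup records to the same awk filter used above.
# The second record is hypothetical sample data, not real output.
kept=$(printf '%s\n' \
    'chr1 808922 G A 21 116 53 9 aAaaaaaAA 9=;:@?9:,' \
    'chr1 808950 C T 12 15 40 4 TtTt 9:;<' |
    awk '$6 >= 20')

echo "$kept"
```

Only the chr1 808922 record is printed, because its SNP quality (116) passes the threshold while the made-up record (15) does not.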
