Tuesday, June 5, 2012

FASTQ version


Reference: http://en.wikipedia.org/wiki/FASTQ_format

Sanger format can encode a Phred quality score from 0 to 93 using ASCII 33 to 126

Illumina 1.3+ format can encode a Phred quality score from 0 to 62 using ASCII 64 to 126
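As a small illustration of the two offsets, the sketch below converts a quality string back to Phred scores by subtracting 33 (Sanger) or 64 (Illumina 1.3+) from each character's ASCII code. The function name `decode_quality` and the sample strings are made up for this example.

```python
def decode_quality(quality, offset):
    """Convert a FASTQ quality string into a list of Phred scores.

    offset is 33 for Sanger format and 64 for Illumina 1.3+ format.
    """
    return [ord(ch) - offset for ch in quality]


# The same Phred scores written in the two encodings (made-up strings).
print(decode_quality("II5+", 33))  # Sanger, Phred+33 -> [40, 40, 20, 10]
print(decode_quality("hhTJ", 64))  # Illumina 1.3+, Phred+64 -> [40, 40, 20, 10]
```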

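So we can take a few sample reads and look at the range of ASCII codes in their quality strings. If any character falls into the range of 33 ~ 63, the file cannot be Illumina 1.3+ format, which never uses codes below 64, so it would be CASAVA 1.8 (Sanger format, which is gaining popularity).

Otherwise, if every character is 64 or above, it is most likely CASAVA 1.3~1.7 (Illumina format).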
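The check can be sketched in a few lines of Python. This is only a heuristic under some assumptions: each record takes exactly four lines (no wrapped sequences), only the two formats above are considered, and the function name `guess_fastq_encoding` and its return labels are invented for this example. A Sanger file whose sampled reads all happen to have very high quality would be misreported, so a larger sample is safer.

```python
import itertools


def guess_fastq_encoding(lines, max_records=1000):
    """Guess the quality encoding from an iterable of FASTQ lines.

    Assumes 4-line records; the quality string is the 4th line of each.
    Returns "sanger" or "illumina-1.3+" (labels chosen for this sketch).
    """
    lowest = None
    # Take lines 4, 8, 12, ... of the first max_records records.
    for quality in itertools.islice(lines, 3, 4 * max_records, 4):
        quality = quality.rstrip("\r\n")
        if not quality:
            continue
        smallest = min(ord(ch) for ch in quality)
        lowest = smallest if lowest is None else min(lowest, smallest)
    if lowest is None:
        raise ValueError("no quality lines found")
    # Codes below 64 cannot occur in Illumina 1.3+ format.
    return "sanger" if lowest < 64 else "illumina-1.3+"


# Tiny in-memory examples (made-up reads).
sanger_sample = ["@r1\n", "ACGT\n", "+\n", "II5+\n"]
illumina_sample = ["@r1\n", "ACGT\n", "+\n", "hhTJ\n"]
print(guess_fastq_encoding(sanger_sample))
print(guess_fastq_encoding(illumina_sample))
```

For a real file, an open file handle can be passed directly, for example `with open("reads.fastq") as handle: guess_fastq_encoding(handle)`, where `reads.fastq` is a placeholder name.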

For BWA, use "bwa aln -I" for Illumina format. For Sanger format, remove the "-I".

For Novoalign, use "novoalign -F ILMFQ" for Illumina format. For Sanger format, use "novoalign -F STDFQ" or "novoalign -F ILM1.8".