A few days ago a user reported that a simple "@align" job had been running on the cluster for more than 5 days, which is clearly abnormal. After a few hours of work I tracked the problem down to 3 causes. The first two both came from a corrupted input FASTQ file: its format was incorrect and so was its size. The third was novoalign itself: given a corrupted FASTQ file, novoalign keeps processing indefinitely without printing any error message.

Until novoalign adds a feature to validate its input files, we can do a simple validation ourselves.

For paired-end files, first compare their sizes. The two line counts should be equal (each read takes 4 lines, so equal line counts mean equal numbers of reads):

$ zcat X1.fq.gz | wc -l
$ zcat X2.fq.gz | wc -l

For a single-end file, check that the line count is divisible by 4. The command below should print 0:

$ expr `zcat X.fq.gz | wc -l` % 4
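The two checks can be wrapped into small shell functions so they run before a job is submitted. This is only a sketch: the function names (fastq_lines, check_single, check_pair) are made up for illustration and are not part of novoalign. The gzip -t integrity test is an extra step I am assuming is useful for catching truncated archives; it reads the file a second time, so it may be slow on very large inputs.

```shell
#!/bin/sh
# Hypothetical helper functions; the names are illustrative only.

# Print the number of lines in a gzipped FASTQ file.
# "gzip -dc" does the same job as zcat; tr strips padding that some wc builds add.
fastq_lines() {
    gzip -dc "$1" | wc -l | tr -d ' '
}

# Succeed only if the archive is intact and its line count is a multiple of 4
# (one FASTQ record takes 4 lines).
check_single() {
    # Assumption: a truncated or damaged .gz should be rejected up front.
    gzip -t "$1" 2>/dev/null || return 1
    n=$(fastq_lines "$1")
    [ $((n % 4)) -eq 0 ]
}

# Succeed only if both mates pass check_single and have the same line count.
check_pair() {
    check_single "$1" || return 1
    check_single "$2" || return 1
    [ "$(fastq_lines "$1")" -eq "$(fastq_lines "$2")" ]
}
```

After sourcing these functions, something like check_pair X1.fq.gz X2.fq.gz || echo "corrupted input" >&2 would flag a bad pair before the alignment starts. Note that this only checks sizes; a file can pass and still have malformed records.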
Thursday, March 21, 2013
Infinite processing time for novoalign