Monday, June 20, 2011

SNP Filter

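# VariantFiltration does not delete records; it marks failing SNPs in the FILTER column.
# Flags SNPs with strand bias SB >= 0.10
# Flags SNPs with quality by depth QD < 5.0
# Flags SNPs in homopolymer runs with HRun >= 4
# Flags clustered SNPs (>= 3 SNPs within a 10 bp window)
# Masks out SNPs that overlap the filtered indels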

java -Xmx32g -jar $HOME/tool/gatk/GenomeAnalysisTK.jar -T VariantFiltration -R $HOME/data/hg19.fasta \
-o x1.snp.filtered.vcf \
-B:variant,VCF x1.snp.vcf \
-B:mask,VCF ../INDEL/x1.indel.vcf \
--maskName InDel \
--clusterSize 3 \
--clusterWindowSize 10 \
--filterExpression "SB >= 0.10 || QD < 5.0 || HRun >= 4" \
--filterName "SB_QD_HRun"

# Copy the header lines and the records whose FILTER column (column 7) is "PASS" to "x1.snp.passed.vcf"

awk -F'\t' '/^#/ || $7 == "PASS"' x1.snp.filtered.vcf > x1.snp.passed.vcf
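
# Optional sanity check before trusting the "PASS" file: tally how many records
# carry each FILTER value in the filtered VCF. This is only a minimal sketch;
# the function name count_filters is my own (hypothetical, not part of GATK),
# and it assumes a standard tab-delimited VCF with FILTER in column 7.

```shell
# count_filters FILE
# Prints "<count><TAB><FILTER value>" for every distinct FILTER value in FILE,
# most frequent first. Header lines (starting with "#") are skipped.
# NOTE: count_filters is a hypothetical helper name, not a GATK tool.
count_filters() {
  awk -F'\t' '
    !/^#/ { n[$7]++ }                              # tally column 7 of data records
    END   { for (f in n) print n[f] "\t" f }       # emit count and filter value
  ' "$1" | sort -k1,1nr -k2,2                      # by count (descending), then name
}

# Usage (assumes the file produced by the VariantFiltration step exists):
#   count_filters x1.snp.filtered.vcf
```

# The "PASS" count reported here should equal the number of non-header lines in
# x1.snp.passed.vcf (grep -vc '^#' x1.snp.passed.vcf).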


