1. wget ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606/VCF/00-All.vcf.gz
2. gunzip 00-All.vcf.gz
3. awk '/^#/ {print $0}' 00-All.vcf > head.txt
4. sed -i 's/chrMT/chrM/g' head.txt
5. awk '/^#/ {next}{print $0}' 00-All.vcf | sed 's/^/chr/' > 1.vcf
6. sed -i 's/chrMT/chrM/g' 1.vcf
7. cat head.txt 1.vcf > hg19.dbsnp.vcf
8. IGVTools/igvtools index hg19.dbsnp.vcf
9. awk '/^#/ {next}{print $1}' hg19.dbsnp.vcf | sort | uniq
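Steps 3 to 7 could probably be collapsed into a single streaming pass, which avoids writing head.txt and 1.vcf to disk. The sketch below is one possible way to do that, not the method used above: the function name add_chr_prefix is made up for illustration, and it assumes the mitochondrial contig is named MT in the first column. Unlike the sed calls in steps 4 and 6, it only touches the chromosome column and leaves header lines unchanged.

```shell
# Hypothetical helper (name is an assumption, not part of the original steps):
# reads a VCF on stdin, passes header lines through unchanged, and rewrites
# only the first (chromosome) column of each record: MT -> chrM, N -> chrN.
add_chr_prefix() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^#/ { print; next }
         { $1 = ($1 == "MT") ? "chrM" : "chr" $1; print }'
}

# Possible usage with the file names from the steps above:
# gunzip -c 00-All.vcf.gz | add_chr_prefix > hg19.dbsnp.vcf
```

Because the input is streamed from the compressed file, the uncompressed 00-All.vcf from step 2 is not needed either; the indexing in step 8 and the contig check in step 9 would then run on hg19.dbsnp.vcf as before.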
Many thanks for this point by point list of commands.
I tried indexing the file with samtools index and got this error:
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[bam_index_core] Invalid BAM header.
[bam_index_build2] fail to index the BAM file.
Does the IGVTools index have the same problem?
Thanks in advance,
Francesco Maria Calabrese
University of Naples Federico II
Dept. of Molecular Medicine and Medical Biotechnologies
CEINGE Biotecnologie Avanzate
Via Gaetano Salvatore, 486
(già Via Comunale Margherita, 482)
80145 Napoli Italy