Wednesday, December 8, 2010

Make a simulation for 7550R

#train the simulation parameters from the raw FASTQ file (101 bp, paired-end)

>maq simutrain 7550.simupars.dat 7550X1_101122_SN141_0314_B80R4NABXX_1_1.txt


#we want to test on hg19 chr20 only. chr20 is 63,025,520 bp (~63 million). For 30x coverage with 101 bp reads we need 30*63000000/101 = 18712871 reads, which we round up to 19 million reads.

>maq simulate -d 200 -s 20 -N 19000000 7550.chr20.1.fq 7550.chr20.2.fq chr20.fa 7550.simupars.dat > 7550.chr20.snps
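#the read-count estimate above can be re-checked with a small shell helper. This is only a sketch: reads_for_coverage is a made-up name, not part of maq, and it rounds up so the target coverage is always met. It also treats the result as single reads; if -N in maq simulate counts read pairs rather than single reads (worth checking in the maq manual), the number passed to -N would be half of this.

```shell
#!/bin/sh
# reads_for_coverage COVERAGE GENOME_BP READ_LEN
# (hypothetical helper, not a maq command)
# Prints the smallest whole number of reads that gives at least
# COVERAGE-fold depth over GENOME_BP bases with READ_LEN bp reads.
reads_for_coverage() {
    # integer ceiling of (coverage * genome_bp) / read_len
    echo $(( ($1 * $2 + $3 - 1) / $3 ))
}

reads_for_coverage 30 63025520 101   # exact hg19 chr20 length
reads_for_coverage 30 63000000 101   # rounded length used in the note above
```

#both results are below 19 million, so -N 19000000 gives a bit more than the 30x target.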

#align the simulated paired-end reads to chr20 with novoalign
>novoalign -o SAM $'@RG\tID:YING\tPL:ILLUMINA\tLB:LB_TEST\tSM:6966R\tCN:HCI' -F ILMFQ -d chr20.ndx -f 7550.chr20.1.fq 7550.chr20.2.fq > 7550.chr20.sam 2>7550.chr20.log
