Goal: Find the microRNAs that are differentially expressed (DE) between two groups. Each group consists of 6 ~ 7 biological samples which go through illumina HiSeq2000 sequencer. Single-end with 50bp length. Method: 1. download gene table for microRNA from miRbase >wget ftp://mirbase.org/pub/mirbase/CURRENT/genomes/hsa.gff3 2. convert it from GFF into BED format(and filter non-microRNA) >awk '/^#/{print;next;} $3 == "miRNA" {print "chr"$1"\t"$4"\t"$5"\t"$9}' hsa.gff3 > hsa.bed 3. Align reads to hg19 whole genome using novoalign >novoalign -r None -d hg19.nix -f A_1.fq.gz A_2.fq.gz | gzip > A.sam.gz >java -jar ~/picard/SortSam.jar INPUT=A.sam.gz OUTPUT=A.bam CREATE_INDEX=true SO=coordinate VERBOSITY=ERROR QUIET=true >mv A.bai A.bam.bai 4. Counting with bedtools >~/BEDTools-Version-2.16.2/bin/bedtools multicov -bams A.bam -bed hsa.bed | awk '$4 !~ 0 {print $0}' > A.txt (to be continued)
Tuesday, May 22, 2012
microRNA differential expression
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment