Tuesday, May 22, 2012

microRNA differential expression

Goal: 
Find the microRNAs that are differentially expressed (DE) between two groups. Each group consists of 6 ~ 7 biological samples which go through illumina HiSeq2000 sequencer. Single-end with 50bp length.

Method:
1. download gene table for microRNA from miRbase
>wget ftp://mirbase.org/pub/mirbase/CURRENT/genomes/hsa.gff3


2. convert it from GFF into BED format(and filter non-microRNA)
>awk '/^#/{print;next;} $3 == "miRNA" {print "chr"$1"\t"$4"\t"$5"\t"$9}' hsa.gff3 > hsa.bed


3. Align reads to hg19 whole genome using novoalign
>novoalign -r None -d hg19.nix -f A_1.fq.gz A_2.fq.gz | gzip > A.sam.gz
>java -jar ~/picard/SortSam.jar INPUT=A.sam.gz OUTPUT=A.bam CREATE_INDEX=true SO=coordinate VERBOSITY=ERROR QUIET=true
>mv A.bai A.bam.bai 


4. Counting with bedtools
>~/BEDTools-Version-2.16.2/bin/bedtools multicov -bams A.bam -bed hsa.bed | awk '$4 !~ 0 {print $0}' > A.txt

(to be continued)

No comments:

Post a Comment