#download and unpack STAR
$wget https://rna-star.googlecode.com/files/STAR_2.3.0e.Linux_x86_64.tgz
$tar -zxvf STAR_2.3.0e.Linux_x86_64.tgz
$cd STAR_2.3.0e.Linux_x86_64/
#splice junction data
$wget ftp://ftp2.cshl.edu/gingeraslab/tracks/STARrelease/STARgenomes/SpliceJunctionDatabases/gencode.v14.annotation.gtf.sjdb
#create the directory for the genome index
$mkdir hg19
#generate the genome index with splice junction annotations
$./STAR --runMode genomeGenerate --genomeDir hg19 --genomeFastaFiles /projects/confidential_sequence/home/sun/data/hg19/hg19.fa --runThreadN 4 --sjdbFileChrStartEnd gencode.v14.annotation.gtf.sjdb --sjdbOverhang 99
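The value 99 passed to --sjdbOverhang above is read length minus 1, which is what the STAR manual recommends, so it suits 100-base reads. Below is a minimal sketch for deriving that value from a FASTQ file, assuming all reads have the same length. The function name sjdb_overhang and the file name in the usage comment are hypothetical, not part of STAR.

```shell
# Hypothetical helper: print (read length - 1) for the first read in a FASTQ file.
# Works on plain and gzip-compressed files.
sjdb_overhang() {
    case "$1" in
        *.gz) gzip -dc "$1" ;;
        *)    cat "$1" ;;
    esac | awk 'NR == 2 { print length($0) - 1; exit }'   # line 2 holds the first sequence
}

# Usage (file name is an assumption):
#   sjdb_overhang sample_1.fastq.gz
```

If the reads vary in length, this first-read estimate is not reliable, and the longest read length minus 1 would be the safer choice.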
$mv hg19 ~/data/star_hg19
$cd ~
$mkdir -p test/STAR; cd ~/test/STAR
#full paths to the input files do NOT work; instead, create symlinks in the local folder.
$ln -s
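As a sketch of the symlink step, the helper below links every *.fastq.gz file from a source directory into the current one. The function name link_fastq and the directory in the usage comment are hypothetical; substitute wherever the reads actually live.

```shell
# Hypothetical helper: symlink all *.fastq.gz files from a directory into the current directory.
link_fastq() {
    src_dir="$1"
    for f in "$src_dir"/*.fastq.gz; do
        [ -e "$f" ] || continue   # skip when the glob matches nothing
        ln -sf "$f" .             # -f replaces a stale link of the same name
    done
}

# Usage (directory is an assumption; an absolute path keeps the links valid):
#   link_fastq /path/to/fastq_dir
```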
#run
$time STAR_2.3.0e.Linux_x86_64/STAR --genomeDir ~/data/star_hg19 --readFilesIn *.fastq.gz --outFileNamePrefix
The output is a SAM file named "OUTPUT.Aligned.out.sam".
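A quick sanity check on the resulting SAM file is to count its alignment records. This is a minimal sketch that relies only on the SAM convention that header lines start with "@". The function name count_sam_records is hypothetical.

```shell
# Hypothetical helper: count alignment records (non-header lines) in a SAM file.
count_sam_records() {
    # Header lines begin with "@"; every other line is one alignment record.
    awk '!/^@/ { n++ } END { print n + 0 }' "$1"
}

# Usage:
#   count_sam_records OUTPUT.Aligned.out.sam
```

This counts records, not reads: a read that maps to several loci appears on several lines.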
STAR indexes the genome with an uncompressed suffix array, trading a large amount of memory (>30GB for the human genome) for speed.
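Given that memory requirement, it can be worth checking the machine before starting a run. The sketch below assumes a Linux host with /proc/meminfo. The function name has_enough_ram is hypothetical, and the 30 GB threshold is only the rough figure quoted above.

```shell
# Hypothetical helper: succeed if total RAM is at least the given number of GB.
# Reads MemTotal (reported in kB) from /proc/meminfo, or from a file given as $2.
has_enough_ram() {
    required_gb="$1"
    meminfo="${2:-/proc/meminfo}"
    total_kb=$(awk '/^MemTotal:/ { print $2; exit }' "$meminfo")
    [ -n "$total_kb" ] && [ "$total_kb" -ge $((required_gb * 1024 * 1024)) ]
}

# Usage:
#   has_enough_ram 30 || echo "less than 30 GB of RAM; STAR may fail on hg19"
```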