Set up a Hadoop cluster
1. Software
Windows 8 + VMware Workstation 9.0 + Ubuntu 13.04 x64 server + hadoop-2.1.0-beta
Name the first Ubuntu installation "master"; this will be the master server (namenode + secondary namenode).
Then snapshot/clone 3 images from "master" and name them "n1", "n2" and "n3". These 3 images will be the slave nodes.
2. Hadoop configuration
There are too many steps to list one by one here; Google is your friend.
Do not forget to change /etc/hosts and /etc/hostname on n1, n2 and n3. Start Hadoop and open a web browser to check that it works: http://192.168.1.2:8088/cluster
(192.168.1.2 is the IP of my master server.)
You should see 3 active nodes in the Cluster Metrics table.
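The per-clone edits to /etc/hostname and /etc/hosts can be scripted. The sketch below is my own and rehearses the steps inside a scratch directory (a hypothetical `$ROOT` stands in for `/`) so it runs without root; on a real clone, drop the `$ROOT` prefix, run with sudo, and reboot (or run `hostname n1`) afterwards.

```shell
#!/bin/sh
# Rehearse renaming a cloned VM to "n1" (use n2 / n3 on the other clones).
ROOT=$(mktemp -d)
mkdir -p "$ROOT/etc"
printf '192.168.221.128 prime\n' > "$ROOT/etc/hosts"  # as inherited from the cloned image

NODE=n1
echo "$NODE" > "$ROOT/etc/hostname"

# Append the cluster entries from this walkthrough if they are not present yet.
grep -q "[[:space:]]$NODE\$" "$ROOT/etc/hosts" || cat >> "$ROOT/etc/hosts" <<'EOF'
192.168.221.133 n1
192.168.221.139 n2
192.168.221.140 n3
EOF

cat "$ROOT/etc/hostname" "$ROOT/etc/hosts"
```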
Get all necessary apps and data
1. Install the BWA package
- #get the code
- git clone https://github.com/lh3/bwa.git
- cd bwa
- #install two required libs
- sudo apt-get install gcc
- sudo apt-get install zlib1g-dev
- #compile
- make
- #clean up: keep only the bwa executable
- find . -mindepth 1 ! -name "bwa" -prune -exec rm -rf {} +
2. Download some reference genome data - for testing purposes, we don't need all of it
- wget http://hgdownload.soe.ucsc.edu/goldenPath/hg19/chromosomes/chr1.fa.gz
- wget http://hgdownload.soe.ucsc.edu/goldenPath/hg19/chromosomes/chr2.fa.gz
- zcat *.gz > hg19.fasta
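A quick sanity check after the merge is to count the sequence headers in the combined FASTA. The snippet below rehearses this on two tiny made-up records (hypothetical data standing in for the chr1/chr2 downloads):

```shell
DIR=$(mktemp -d)
cd "$DIR"
# Toy stand-ins for the gzipped UCSC chromosome files.
printf '>chr1\nACGTACGT\n' | gzip > chr1.fa.gz
printf '>chr2\nTTAACCGG\n' | gzip > chr2.fa.gz

zcat *.gz > hg19.fasta        # same merge step as above
grep -c '^>' hg19.fasta       # one header line per chromosome
```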
3. Build the genome index for BWA (this writes the index files hg19.amb, hg19.ann, hg19.bwt, hg19.pac and hg19.sa)
- bwa index -p hg19 hg19.fasta
########################################
#/etc/hosts
192.168.221.128 prime
192.168.221.133 n1
192.168.221.139 n2
192.168.221.140 n3
#/etc/hostname
prime
sudo apt-get install vim
sudo apt-get install openssh-server
sudo apt-get install screen
sudo apt-get install openjdk-7-jdk
sudo apt-get install eclipse-platform
wget http://apache.cs.utah.edu/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz
tar -zxvf hadoop-2.2.0.tar.gz
#.bash_profile
alias hj="hadoop jar"
alias hf="hadoop fs"
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export DISPLAY=192.168.1.133:0.0
export HADOOP_HOME=/home/hadoop/hadoop-2.2.0
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/hadoop/hadoop-2.2.0/lib/native
export EC2_HOME=$HOME/tools/ec2-api-tools-1.6.11.0
export PATH=$PATH:/home/hadoop/hadoop-2.2.0/bin
export PATH=$PATH:/home/hadoop/hadoop-2.2.0/sbin
export PATH=$PATH:/home/hadoop/tools/bwa:/home/hadoop/tools/snap-0.15.4-linux
export PATH=$PATH:$HOME/tools/ec2-api-tools-1.6.11.0/bin
#hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
#core-site.xml
<property>
    <name>fs.default.name</name>
    <value>hdfs://prime:50011/</value>
    <final>true</final>
</property>
#/etc/hosts
192.168.221.128 prime
192.168.221.133 n1
192.168.221.139 n2
192.168.221.140 n3
#/etc/hostname
prime
sudo apt-get install vim
sudo apt-get install openssh-server
sudo apt-get install screen
sudo apt-get install openjdk-7-jdk
sudo apt-get install eclipse-platform
wget http://apache.cs.utah.edu/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz
tar -zxvf hadoop-2.2.0.tar.gz
#.bash_profile
alias hj="hadoop jar"
alias hf="hadoop fs"
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export DISPLAY=192.168.1.133:0.0
export HADOOP_HOME=/home/hadoop/hadoop-2.2.0
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/hadoop/hadoop-2.2.0/lib/native
export EC2_HOME=$HOME/tools/ec2-api-tools-1.6.11.0
export PATH=$PATH:/home/hadoop/hadoop-2.2.0/bin
export PATH=$PATH:/home/hadoop/hadoop-2.2.0/sbin
export PATH=$PATH:/home/hadoop/tools/bwa:/home/hadoop/tools/snap-0.15.4-linux
export PATH=$PATH:$HOME/tools/ec2-api-tools-1.6.11.0/bin
#hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
#core-site.xml
#hdfs-site.xml
#yarn-site.xml
#slaves
n1
n2
n3
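The hdfs-site.xml and yarn-site.xml contents are not shown above. As a minimal sketch only - these property values are illustrative, not taken from this cluster - a Hadoop 2.2 setup like this one typically needs something along these lines:

```xml
<!-- hdfs-site.xml: illustrative, not the author's values -->
<property>
    <name>dfs.replication</name>
    <value>3</value>
</property>

<!-- yarn-site.xml: mapreduce_shuffle is required for MapReduce on YARN 2.2 -->
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>prime</value>
</property>
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
```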