Friday, April 10, 2015

Lightweight MRAppMaster?

Hadoop v2 moves the master application from the Master Node into a container. The advantage is that this greatly reduces the workload on the Master Node when launching multiple MapReduce applications, and it lets the Hadoop framework scale more easily beyond thousands of Worker Nodes.

However, this creates a problem in a small cluster: the master application, "MRAppMaster", now occupies a whole container.

Suppose we have five Worker Nodes and every node has just one container. Since "MRAppMaster" uses one container, only four containers are left for real processing, so 20% of the computing resources are "wasted". We can mitigate this by assigning two containers per node, which gives ten containers in total, so only 10% of the computing resources are wasted. However, if we divide a node into two containers, every container's memory and CPU are cut in half too. Memory is very precious in many bioinformatics applications. With less than 8 GB of memory, your aligner will probably fail.
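The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this post, it assumes a single running job (one "MRAppMaster") and equal-sized containers, and the 16 GB node size is a hypothetical figure, not something measured on a real cluster.

```python
def am_overhead(worker_nodes, containers_per_node, app_masters=1):
    """Fraction of all containers held by MRAppMaster instances.

    Assumes equal-sized containers and one MRAppMaster per running job.
    """
    total = worker_nodes * containers_per_node
    if total <= 0:
        raise ValueError("the cluster needs at least one container")
    if app_masters > total:
        raise ValueError("more application masters than containers")
    return app_masters / total


def memory_per_container_gb(node_memory_gb, containers_per_node):
    """Memory each container gets when a node is split evenly."""
    return node_memory_gb / containers_per_node


if __name__ == "__main__":
    node_memory_gb = 16  # hypothetical node size, for illustration only
    for containers in (1, 2):
        overhead = am_overhead(worker_nodes=5, containers_per_node=containers)
        memory = memory_per_container_gb(node_memory_gb, containers)
        print(f"{containers} container(s) per node: "
              f"{overhead:.0%} overhead, {memory:.0f} GB per container")
```

Halving the overhead also halves the memory per container, which is exactly the trade-off that hurts memory-hungry aligners.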

If your Hadoop cluster has more than 10 nodes, the "MRAppMaster" container is a small enough share of the cluster that you do not need to worry about this.

