Monday, April 20, 2015

Run a BWA job with docker

To run a BWA mapping job inside Docker, I want to create three containers: one for the "bwa" executable, one for the "bwa" genome index, and one for the input FASTQ files. The benefits can be summarized in one word: "isolation".
  • 1. The bwa executable. I would put it into a volume /bioapp/bwa/0.7.9a/ in a Docker image named "yings/bioapp".
  • 2. The reference genome index, which was created using "bwa index". I would put it into a volume /biodata/hg19/index/bwa/ in a Docker image named "yings/biodata".
  • 3. The input FASTQ files, assuming they can be found under "/home/hadoop/fastq" on the host.
Create an image named "bioapp" with the tag "v04202015", copying all files and folders under "app" into "/bioapp/" in the container. For simplicity, other steps such as installing dependencies are not included in the Dockerfile.
The Dockerfile looks like this:

    FROM ubuntu:14.04
    RUN mkdir -p /bioapp
    COPY app /bioapp/
    VOLUME /bioapp
    ENTRYPOINT /usr/bin/tail -f /dev/null
 
Build the bioapp image:

$docker build -t yings/bioapp:v04202015 .
Create an image named "biodata" with the tag "v04202015", copying all files and folders under "data" into "/biodata/" in the container:
 $cat >Dockerfile <<EOF
 FROM ubuntu:14.04
 RUN mkdir -p /biodata
 COPY data /biodata/
 VOLUME /biodata
 ENTRYPOINT /usr/bin/tail -f /dev/null
 EOF
Build the biodata image:
 $docker build -t yings/biodata:v04202015 .
Start the biodata container as a daemon and name it "biodata":
$docker run -d --name biodata yings/biodata:v04202015
Start the bioapp container as a daemon and name it "bioapp":
$docker run -d --name bioapp yings/bioapp:v04202015
Now we should have two data volume containers running in the background. It is time to launch the final executor container:
$docker run -it --volumes-from biodata --volumes-from bioapp -v /home/hadoop/fastq:/fastq ubuntu:14.04 /bin/bash
Those parameters mean:
  • "-it": run the executor container interactively.
  • "--volumes-from biodata": mount the data volume from the container "biodata" (do not confuse it with the image "yings/biodata").
  • "--volumes-from bioapp": mount the data volume from the container "bioapp" (again, do not confuse it with the image "yings/bioapp").
  • "-v /home/hadoop/fastq:/fastq": mount the host directory "/home/hadoop/fastq" at "/fastq" in the executor container.
  • "ubuntu:14.04": the standard image, which serves as our OS.
  • "/bin/bash": the command to run when the container starts.
If everything goes well, you will see that you are now root in the executor container:
root@5927eecc8530:/#
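The same flags can also be used for a one-shot, non-interactive run. The following is only a sketch: the function name "run_executor" and the variables "DRY_RUN" and "FASTQ_DIR" are made up for illustration, while the container names, image, and mount point are the ones used in this post. With DRY_RUN=1 it only prints the command it would run.

```shell
#!/bin/sh
# Hypothetical wrapper around the "docker run" command from this post.
# FASTQ_DIR: host directory with the FASTQ files (default taken from the post).
# DRY_RUN=1: print the assembled command instead of executing it.
run_executor() {
    # Prepend the docker invocation to whatever command the caller passed in.
    set -- docker run --rm \
        --volumes-from biodata --volumes-from bioapp \
        -v "${FASTQ_DIR:-/home/hadoop/fastq}:/fastq" \
        ubuntu:14.04 "$@"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Example (prints the command only):
#   DRY_RUN=1; run_executor ls /bioapp /biodata /fastq
```

"--rm" removes the executor container once the command exits. Note that a redirect such as "> /fastq/A.sam" written after "run_executor ..." is interpreted by the host shell; to redirect inside the container, wrap the job in "bash -c '...'".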
Is bwa there?
root@5927eecc8530:/# ls -R /bioapp/
Is the genome index there?
root@5927eecc8530:/# ls -R /biodata/
Are the FASTQ files there?
root@5927eecc8530:/# ls -R /fastq/
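Instead of reading the "ls -R" listings by eye, the index check can be scripted. This is a minimal sketch: the function name "check_bwa_index" is made up, and it relies on "bwa index" writing five files next to the reference prefix (.amb, .ann, .bwt, .pac, .sa).

```shell
#!/bin/sh
# Hypothetical helper: verify that all five bwa index files exist
# (and are non-empty) for a given index prefix.
check_bwa_index() {
    prefix=$1
    missing=0
    for ext in amb ann bwt pac sa; do
        if [ ! -s "$prefix.$ext" ]; then
            echo "missing: $prefix.$ext"
            missing=1
        fi
    done
    # Exit status 0 if the index is complete, 1 otherwise.
    return "$missing"
}

# Example, inside the executor container, with the prefix that is
# passed to "bwa mem" in this post:
#   check_bwa_index /biodata/index/hg19/bwa/ucsc.hg19 && echo "index OK"
```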
Launch the job and save the alignment as "/fastq/A.sam":
root@5927eecc8530:/# /bioapp/bwa/0.7.9a/bwa mem -t 8 -R '@RG\tID:group_id\tPL:illumina\tSM:sample_id' /biodata/index/hg19/bwa/ucsc.hg19 /fastq/A_R1.fastq.gz /fastq/A_R2.fastq.gz > /fastq/A.sam
The job should complete in a few minutes because the input FASTQ files are very small test files. You can now safely exit the executor container by pressing "Ctrl-D". Since we mounted the host directory "/home/hadoop/fastq" at "/fastq" in the executor container, back on the host we will see the persistent output "A.sam" under "/home/hadoop/fastq". However, if we use "/biodata" or "/bioapp" as the output folder, for example "bwa ... > /biodata/A.sam", then the output is NOT persistent: if you terminate the "biodata" container, all changes made in that container are lost. (Stopping and restarting the container is fine, as long as it is not terminated.)
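A quick sanity check of the result on the host could look like the sketch below. The function name "sam_summary" is made up for illustration; it relies only on the fact that SAM header lines start with "@" and every other non-empty line is an alignment record.

```shell
#!/bin/sh
# Hypothetical helper: count header lines and alignment records in a SAM file.
sam_summary() {
    awk '/^@/ { h++ }
         !/^@/ && NF { r++ }
         END { printf "header_lines=%d records=%d\n", h, r }' "$1"
}

# Example, on the host after the executor container has exited:
#   sam_summary /home/hadoop/fastq/A.sam
```

A file with zero records, or with no "@RG" line in the header, would suggest that the job did not run as intended.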
