Next Generation Sequencing and Data Analysis: Multi-thread and Multi-process

Saturday, June 5, 2010

Multi-thread and Multi-process

Requirement:

Given a huge (tens of GB) FASTQ file from an Illumina NGS run, align the short sequences in the file to a reference genome. Because the file is so large, it should be split into small segments, and the segments should then be aligned in parallel on a multi-CPU SMP machine or a multi-core CPU.

My current solution:
1. Split the big file on-the-fly
2. Use Python's "multiprocessing" and "subprocess" modules to run the alignments in parallel
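
The two steps above can be sketched roughly as follows. This is only a minimal sketch, not my finished code, and it makes several assumptions: every FASTQ record is exactly 4 lines (no wrapped sequences), the aligner is an external program that takes the chunk file as its last argument and writes its alignments to stdout, and all the names here (iter_fastq_chunks, write_chunks, align_chunk, align_in_parallel, the ".aln" suffix) are made up for illustration.

```python
import itertools
import multiprocessing
import os
import subprocess


def iter_fastq_chunks(handle, reads_per_chunk):
    """Yield lists of lines, each holding up to reads_per_chunk FASTQ records.

    Assumes 4 lines per record, so the file is never loaded whole.
    """
    lines_per_chunk = 4 * reads_per_chunk
    while True:
        chunk = list(itertools.islice(handle, lines_per_chunk))
        if not chunk:
            return
        yield chunk


def write_chunks(fastq_path, work_dir, reads_per_chunk):
    """Split the big file on-the-fly, yielding each chunk path once written."""
    with open(fastq_path) as handle:
        chunks = iter_fastq_chunks(handle, reads_per_chunk)
        for index, chunk in enumerate(chunks):
            chunk_path = os.path.join(work_dir, "chunk_%06d.fastq" % index)
            with open(chunk_path, "w") as out:
                out.writelines(chunk)
            yield chunk_path


def align_chunk(task):
    """Run the external aligner on one chunk; return (output path, exit code)."""
    aligner_cmd, chunk_path = task
    out_path = chunk_path + ".aln"  # hypothetical naming scheme
    with open(out_path, "w") as out:
        # aligner_cmd is a placeholder argument list supplied by the caller;
        # the chunk path is appended as the final argument (an assumption).
        result = subprocess.run(list(aligner_cmd) + [chunk_path], stdout=out)
    return out_path, result.returncode


def align_in_parallel(fastq_path, aligner_cmd, work_dir,
                      reads_per_chunk=1000000, workers=None):
    """Split fastq_path into chunks and align them with a pool of workers."""
    tasks = ((aligner_cmd, path)
             for path in write_chunks(fastq_path, work_dir, reads_per_chunk))
    with multiprocessing.Pool(processes=workers) as pool:
        # Results arrive in completion order, not file order.
        return list(pool.imap_unordered(align_chunk, tasks))
```

A call would look something like align_in_parallel("reads.fastq", ["my_aligner", "ref_index"], "/tmp/chunks"), where the file name, command, and directory are placeholders. Since the real work happens in the external aligner process, the pool workers mostly just wait on subprocess; the pool size is what limits how many alignments run at once (it defaults to the number of CPUs).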

I wrote some preliminary test code on my desktop, and it looks promising. The code should be finished in 3 days if everything goes well.

Yesterday I was working on test code to parallelize the execution of
