Because real genome sequencing data rarely come with a known set of true variants, many tools use NA12878 for benchmarking.
For whole genome sequencing at 50x depth, we can download these two files:
ftp.sra.ebi.ac.uk/vol1/fastq/ERR194/ERR194147/ERR194147_1.fastq.gz
ftp.sra.ebi.ac.uk/vol1/fastq/ERR194/ERR194147/ERR194147_2.fastq.gz
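One possible way to fetch the two files from Python is sketched below. It is only a sketch: the `ftp://` scheme prefix is an assumption (the paths above are listed without a scheme), and `local_name` and `download_all` are hypothetical helper names introduced here for illustration.

```python
import os
import urllib.request

# The two paths listed above; the "ftp://" prefix is an assumption.
FASTQ_URLS = [
    "ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR194/ERR194147/ERR194147_1.fastq.gz",
    "ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR194/ERR194147/ERR194147_2.fastq.gz",
]


def local_name(url, out_dir="."):
    """Return the local path a URL would be saved to (hypothetical helper)."""
    return os.path.join(out_dir, os.path.basename(url))


def download_all(urls=FASTQ_URLS, out_dir="."):
    """Download each file, skipping any that already exist locally."""
    for url in urls:
        target = local_name(url, out_dir)
        if os.path.exists(target):
            continue  # do not re-download a file that is already present
        urllib.request.urlretrieve(url, target)
```

Calling `download_all()` would start the transfer. The two files are close to 50 GB each, so a resumable downloader may be a better fit in practice.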
Some statistics of these two files:

| Facts | ERR194147_1.fastq.gz | ERR194147_2.fastq.gz | 
|---|---|---|
| URI | ERR194147_1.fastq.gz | ERR194147_2.fastq.gz | 
| Size | 48G | 49G | 
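| Number of reads | 787265109 | 787265109 | 
| Read length (bp) | 101 | 101 | 
| Coverage | 101 × 787265109 / 3000000000 ≈ 26.5x | 101 × 787265109 / 3000000000 ≈ 26.5x | 

Each file alone gives about 26.5x over a 3 Gb genome, so the two files together give about 53x, which matches the roughly 50x depth mentioned above.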
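The read counts and the coverage figure in the table can be reproduced with a short script. This is a minimal sketch, assuming standard four-line FASTQ records and the same 3,000,000,000 bp genome size used in the table; `fastq_stats` and `coverage` are hypothetical names, not part of any tool.

```python
import gzip


def fastq_stats(path):
    """Count reads and total bases in a gzipped FASTQ file.

    Assumes each record spans exactly four lines, with the sequence
    on the second line of the record.
    """
    reads = 0
    bases = 0
    with gzip.open(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # sequence line of the record
                reads += 1
                bases += len(line.rstrip("\n"))
    return reads, bases


def coverage(reads, read_length, genome_size=3_000_000_000):
    """Approximate depth: total sequenced bases divided by genome size."""
    return reads * read_length / genome_size


# Values taken from the table above.
per_file = coverage(787_265_109, 101)
combined = 2 * per_file
print(f"per file: {per_file:.1f}x, combined: {combined:.1f}x")
```

Running `fastq_stats` on either full file means decompressing close to 50 GB, so it will take a while. The coverage arithmetic at the bottom needs only the numbers already in the table.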