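Because real genome sequencing data rarely comes with a known set of true variants, many tools are benchmarked on NA12878, a sample whose variants have been extensively characterized.
For whole-genome sequencing at about 50x depth, we can download these two files, which hold the two mates of the paired-end run ERR194147: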
ftp.sra.ebi.ac.uk/vol1/fastq/ERR194/ERR194147/ERR194147_1.fastq.gz
ftp.sra.ebi.ac.uk/vol1/fastq/ERR194/ERR194147/ERR194147_2.fastq.gz
Some statistics for these two files:

Facts | ERR194147_1.fastq.gz | ERR194147_2.fastq.gz |
---|---|---|
URI | ERR194147_1.fastq.gz | ERR194147_2.fastq.gz |
Size | 48G | 49G |
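Reads | 787265109 | 787265109 |
Read length (bp) | 101 | 101 |
Coverage | 101 × 787265109 / 3000000000 ≈ 26.5x | 101 × 787265109 / 3000000000 ≈ 26.5x |

Each file on its own gives about 26.5x coverage of a 3 Gb genome, so the two files together give roughly 53x, which is where the nominal 50x depth comes from.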
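
The numbers in the table can be reproduced with a short script. The following is only a minimal sketch, not the method used to build the table: `fastq_stats` and `coverage` are hypothetical helper names, the genome size of 3,000,000,000 bp is the same rough approximation used above, and the FASTQ files are assumed to use the usual four-lines-per-record layout with all reads the same length as the first one.

```python
import gzip

# Approximate human genome size, the same round figure used in the table.
GENOME_SIZE = 3_000_000_000


def fastq_stats(path):
    """Return (read_count, first_read_length) for a gzipped FASTQ file.

    Assumes four lines per record (no wrapped sequence lines).
    """
    reads = 0
    read_length = 0
    with gzip.open(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:  # second line of each record is the sequence
                reads += 1
                if reads == 1:
                    read_length = len(line.rstrip("\n"))
    return reads, read_length


def coverage(reads, read_length, genome_size=GENOME_SIZE):
    """Estimated depth: total sequenced bases divided by genome size."""
    return reads * read_length / genome_size


# Values taken from the table above (one file = one mate of each pair).
per_file = coverage(787265109, 101)
print(f"per file: {per_file:.1f}x, both mates: {2 * per_file:.1f}x")
# prints "per file: 26.5x, both mates: 53.0x"
```

Calling `fastq_stats("ERR194147_1.fastq.gz")` on the downloaded file should return the read count and read length shown in the table, although scanning a 48G file this way will take a while.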