For better alignment and variant calling (if any), we may need to trim the FASTQ files and keep only the bases from the middle sequencing cycles, which generally have good and consistent quality.
One method is to trim by base quality, which can be done with tools like sickle. Alternatively, we can hard trim all reads, which means removing a fixed number N of bases from the tail (or head) of every read, together with the corresponding quality scores.
A typical FASTQ record:
@READ_NAME
NGGAAATGGCGTCTGGCGGCGAGATAATGG
+
#1=DDFFFHGHHHIIGIIJJJJJJIIJGJI
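Each FASTQ record spans four lines: a header starting with @, the sequence, a + separator, and one quality character per base. The sequence and quality lines must therefore have the same length, and any trimming has to be applied to both. A minimal sketch of a sanity check for this follows; check_fastq_lengths is a hypothetical helper name, and strict four-line records (no wrapped sequences) are assumed.

```shell
# Hypothetical helper: report FASTQ records whose sequence and quality
# lengths differ. Reads FASTQ text from stdin.
# Assumption: every record is exactly four lines (no wrapped sequences).
check_fastq_lengths() {
    awk '
        NR % 4 == 1 { name = $0 }             # header line
        NR % 4 == 2 { seqlen = length($0) }   # sequence line
        NR % 4 == 0 && length($0) != seqlen { # quality line
            printf "%s\tseq=%d\tqual=%d\n", name, seqlen, length($0)
            bad++
        }
        END { exit (bad > 0) }                # non-zero status if any mismatch
    '
}
```

It could be run on a file before and after trimming, for example with: gzip -dc A_1.fastq.gz | check_fastq_lengths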
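Let us say we have paired-end FASTQ files: "A_1.fastq.gz" and "A_2.fastq.gz". The commands below are alternatives (each one writes to A_1.fq.gz) and are shown for A_1.fastq.gz only; run the same command on A_2.fastq.gz so that both mates are trimmed identically.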
# Trim 10 bases from the tail. Even-numbered lines are the sequence and quality lines of each record.
# --posix is a gawk option; older gawk versions need it (or --re-interval) to support the {10} interval expression.
zcat A_1.fastq.gz | awk --posix 'NR % 2 == 0 { sub(/.{10}$/, "") } { print }' | gzip > A_1.fq.gz
# Trim 10 bases from the head.
zcat A_1.fastq.gz | awk --posix 'NR % 2 == 0 { sub(/^.{10}/, "") } { print }' | gzip > A_1.fq.gz
# Trim 10 bases from both the head and the tail.
zcat A_1.fastq.gz | awk --posix 'NR % 2 == 0 { sub(/^.{10}/, ""); sub(/.{10}$/, "") } { print }' | gzip > A_1.fq.gz
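The same hard trimming can be sketched as one reusable function that takes the head and tail counts as arguments and is applied to both mates. This is only an illustration under a few assumptions: hard_trim is a hypothetical name, records are strictly four lines each, and substr is used instead of a regular expression so that no interval-expression support is needed. Unlike the sub() commands, a read that is not longer than head + tail becomes an empty line here, so such reads may need to be filtered out before alignment.

```shell
# Hypothetical helper: remove HEAD bases from the start and TAIL bases from
# the end of every read. Usage: hard_trim HEAD TAIL < in.fastq > out.fastq
# Assumption: every record is exactly four lines.
hard_trim() {
    awk -v head="$1" -v tail="$2" '
        NR % 2 == 0 {
            # Even lines are the sequence and quality lines of each record.
            keep = length($0) - head - tail
            $0 = (keep > 0) ? substr($0, head + 1, keep) : ""
        }
        { print }
    '
}

# Apply the same trimming to both mates so the pairs stay consistent.
for mate in 1 2; do
    src="A_${mate}.fastq.gz"
    [ -f "$src" ] || continue   # skip if the example file is not present
    gzip -dc "$src" | hard_trim 10 10 | gzip > "A_${mate}.fq.gz"
done
```

Passing 0 for either argument trims only the other end, for example hard_trim 0 10 for the tail only.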