How about just submitting each job to your grid engine? The scheduler will handle the parallelization itself:
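#! /bin/sh
### Your script task.sh ###
#$ -S /bin/sh
# Run in the submission directory (-cwd) and pass the exported environment, including destdir, to the job (-V)
#$ -cwd
#$ -V
bwa-0.7.5a/bwa mem -t 8 human_g1k_v37.fasta "${1}_R1.fastq.gz" "${1}_R2.fastq.gz" > "$destdir/${1}_R1_R2.sam"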
Then, in your loop, submit one job per sample:
for fname in *_R1.fastq.gz
do
    # Strip the "_R1..." suffix to get the sample name shared by both read files
    base=${fname%_R1*}
    qsub task.sh "${base}"
done
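To check what the loop will pass to qsub before submitting anything, the suffix stripping can be tried on its own. This is a minimal sketch; the file name sampleA_R1.fastq.gz is hypothetical and only there for illustration:

```shell
# Hypothetical file name, not taken from the original data set
fname="sampleA_R1.fastq.gz"

# ${fname%_R1*} removes the shortest trailing match of the pattern "_R1*"
base=${fname%_R1*}

echo "$base"                 # sample name handed to task.sh as $1
echo "${base}_R1.fastq.gz"   # first read file rebuilt inside task.sh
echo "${base}_R2.fastq.gz"   # second read file rebuilt inside task.sh
```

If the printed names match the files on disk, the same expansion in the loop will hand each job the right pair.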