Question

I am writing a bash script and need to specify input files for STAR, an aligner used in bioinformatics. Its --readFilesIn flag takes, in my case, two sets of fastq files: the files within each set are comma-separated, and the two sets are separated by a space. The input looks like:

STAR [OPTIONS] --readFilesIn fq_r1_1,fq_r1_2,fq_r1_3 fq_r2_1,fq_r2_2,fq_r2_3

Because each set of fastq files differs in number and name, I had thought of using the pipeline below, where $files is the directory containing the fastq files.

dir $files | cut -d " " -f 9 | sed 's/\s *//g' | grep _r1_ | paste -sd ","

This produces the correct format, but the aligner doesn't accept the input. I have also tried writing the output to a file and passing that to the aligner, but that doesn't work either. What is the convention for bash in this instance?

Many thanks in advance,

Bruce.

How can I make the aligner accept this? It wants the input in the form shown in my first example.

Solution

Will the following work?

group1=( fq_r1_* )
group2=( fq_r2_* )
( IFS=,; STAR [OPTIONS] --readFilesIn "${group1[*]}" "${group2[*]}" )

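(The command runs inside a subshell, so there is no need to save and restore the current value, if any, of IFS. With IFS set to a comma, "${group1[*]}" expands to all elements of the array joined by commas, as a single word.)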
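
To check the joined lists before handing them to the aligner, something like the following sketch may help. It assumes bash, and the temporary directory, the dummy file names and the variables r1 and r2 are made up for illustration. It prints the STAR command rather than running it, so it works without STAR installed.

```shell
#!/usr/bin/env bash
# Sketch: build the two comma-separated lists that --readFilesIn expects.
# The temporary directory and the empty dummy files are for illustration only.
workdir=$(mktemp -d)
touch "$workdir"/fq_r1_{1,2,3} "$workdir"/fq_r2_{1,2,3}
cd "$workdir" || exit 1

group1=( fq_r1_* )   # the glob expands in sorted order, one array element per file
group2=( fq_r2_* )

# "${array[*]}" joins the elements with the first character of IFS.
# A command substitution runs in a subshell, so IFS is unchanged afterwards.
r1=$(IFS=,; printf '%s' "${group1[*]}")
r2=$(IFS=,; printf '%s' "${group2[*]}")

# Print the command instead of running it.
echo STAR --readFilesIn "$r1" "$r2"
```

With the dummy files above this should print: STAR --readFilesIn fq_r1_1,fq_r1_2,fq_r1_3 fq_r2_1,fq_r2_2,fq_r2_3. Once the output looks right, dropping the echo and adding the real options runs the aligner.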

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow