Here's another all-awk answer. Create the following executable awk file:
#!/usr/bin/awk -f
BEGIN { DELIM = ","; OFS = "\t" }  # DELIM must differ from FS and must not appear in the data

# first file: reformat each line and set up the arrays
NR == FNR {
    line = $1 OFS $2 OFS $3        # replace with $0 if the first file is tab-delimited
    if (FNR == 1) header = line
    else {
        key = $2 SUBSEP $3         # SUBSEP keeps e.g. (1, 23) distinct from (12, 3)
        a[key] = line; order[FNR-1] = key; cnt++
    }
    next
}

# remaining files: number each file in argument order, then tag matched lines with its name
FILENAME != last_filename { f[FILENAME] = ++fcnt; last_filename = FILENAME }
($2, $3) in a { a[$2, $3] = a[$2, $3] DELIM FILENAME }

# loop over the stored lines in input order, aligning the names in a[] into columns using f[]
END {
    print header
    for (i = 1; i <= cnt; i++) {
        split(a[order[i]], oarr, DELIM)
        printf("%s", oarr[1])
        k = 2
        for (j = 1; j <= fcnt; j++) {
            fname = oarr[k]
            if (f[fname] == j) { o = fname; k++ }
            else o = ""
            printf("%s%s", OFS, o)
        }
        print ""
    }
}
When put into a file called awko, it can be run like awko infile set* to produce:
day  start  stop
1    100    102   set1  set2
1    300    350         set2
2    100    200
3    200    400         set2
The generic breakdown:
- store the first file in some arrays, variables
- create an array of files being tested in argument order - used for alignment
- append any matched file names to the matched line in a[]
- at the end, print out each line in a[] in order, reformatting to align matches
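To make the last step concrete, here is a minimal sketch of the alignment loop on its own. The record and the three file names (set1, set2, set3) are made up for illustration; f[] is filled by hand here, whereas the real script fills it as each file is read. Tabs are shown as | so the empty column is visible.

```shell
# Hypothetical stored record: the reformatted line, then the matching file names, joined by DELIM.
# f[] maps each file name to its column number in argument order (set1=1, set2=2, set3=3).
out=$(awk 'BEGIN {
    DELIM = ","; OFS = "\t"
    f["set1"] = 1; f["set2"] = 2; f["set3"] = 3; fcnt = 3
    rec = "1" OFS "300" OFS "350" DELIM "set2" DELIM "set3"

    n = split(rec, oarr, DELIM)
    printf("%s", oarr[1])
    k = 2
    for (j = 1; j <= fcnt; j++) {
        # print the name only if the next pending match belongs to column j
        if (k <= n && f[oarr[k]] == j) { o = oarr[k]; k++ }
        else o = ""
        printf("%s%s", OFS, o)
    }
    print ""
}')
printf '%s\n' "$out" | tr '\t' '|'    # prints: 1|300|350||set2|set3
```

Because the names were appended in argument order, a single pointer k into oarr is enough: each column either consumes the next pending name or stays empty.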
The line variable exists because the data in the question lost its tabs in translation.
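As a small illustration of what that assignment does, assuming a space-separated input line like the ones in the question (the sample line here is made up), rebuilding the fields with OFS gives a tab-separated line. Tabs are shown as | for readability.

```shell
# Hypothetical space-separated input line; line is rebuilt with OFS (a tab) between the fields.
out=$(printf '1 100 102\n' | awk 'BEGIN { OFS = "\t" } { line = $1 OFS $2 OFS $3; print line }')
printf '%s\n' "$out" | tr '\t' '|'    # prints: 1|100|102
```

If the first file is already tab-delimited, this rebuild is unnecessary and $0 can be stored directly.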