Question

I have two files. The first line of each is a header line.

File1

start end
234 789
678 780
125 457
534 988

File2

start end abc efg hij klm nmo
234 789 NA NA 01 02 NA
678 780 01 NA NA NA NA
125 457 NA 01 01 NA 02
534 988 NA NA NA NA 02

Now I want to compare these two files: column 1 and column 2 of File1 against column 1 and column 2 of File2. Where they match, I want to write a third file containing column 1 and column 2 of File2, followed by the headers of the columns whose field is not 'NA', like the following output file:

start end
234 789 hij, klm
678 780 abc
125 457 efg, hij, nmo
534 988 nmo

I only know how to compare lines; I don't know whether it is possible to print the headers of the columns whose value is not 'NA'.


Solution

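You could try awk -f a.awk File2 File1 (File2 has to come first, because the script reads the column names from the first file it is given), where a.awk is: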

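# Pass 1: File2 (the first file on the command line).
NR==FNR {
    if (NR==1) {
        split($0,b)          # b[i] = name of column i
        next
    }
    # Collect the names of columns 3..NF whose value is not "NA".
    s=""
    for (i=3; i<=NF; i++) {
        if ($i!="NA") {
            if (s)
                s=s", " b[i]
            else
                s=b[i]
        }
    }
    a[$1,$2]=s               # key: the (start, end) pair
    next
}
# Pass 2: File1. Skip its header, then print the rows also seen in File2.
FNR==1 {next}
($1,$2) in a {
    print $1,$2,a[$1,$2]
}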

Output (the script skips both header lines, so "start end" is not printed):

234 789 hij, klm
678 780 abc
125 457 efg, hij, nmo
534 988 nmo
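
If the "start end" header from the question's expected output is wanted as well, one possible variant is to print File1's header instead of skipping it. The following is only a sketch under that assumption, not part of the original answer: the scratch directory and the `result` variable exist purely for the demonstration, and the sample files are copied from the question.

```shell
#!/bin/sh
# Demo: recreate the sample files from the question in a scratch directory.
dir=$(mktemp -d)
cd "$dir" || exit 1

cat > File1 <<'EOF'
start end
234 789
678 780
125 457
534 988
EOF

cat > File2 <<'EOF'
start end abc efg hij klm nmo
234 789 NA NA 01 02 NA
678 780 01 NA NA NA NA
125 457 NA 01 01 NA 02
534 988 NA NA NA NA 02
EOF

# Same logic as a.awk, except that the header of File1 is printed.
result=$(awk '
NR==FNR {                                 # first file given: File2
    if (FNR==1) { split($0, b); next }    # remember the column names
    s = ""
    for (i=3; i<=NF; i++)
        if ($i != "NA")
            s = (s == "" ? "" : s ", ") b[i]
    a[$1,$2] = s
    next
}
FNR==1 { print; next }                    # second file: File1 header
($1,$2) in a { print $1, $2, a[$1,$2] }
' File2 File1)

printf '%s\n' "$result" > File3
cat File3
```

Rows are printed in the order they appear in File1, because File1 is the file being scanned when the output is produced.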

Other suggestions

Here's one way using awk. Run it like this:

awk -f ./script.awk File1 File2 > File3

Contents of script.awk:

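# File1: remember its header line for the output.
NR==1 {
    h=$0
    next
}

# File1: record each (start, end) pair as an array key.
FNR==NR {
    a[$1,$2]
    next
}

# File2 header: save the column names, then print the output header.
FNR==1 {
    split($0, b)
    print h
    next
}

# File2 rows whose (start, end) pair also occurs in File1.
($1,$2) in a {
    for (i=3;i<=NF;i++) {
        # c is the column name when the field is not "NA", otherwise empty.
        c = ($i != "NA" ? b[i] : "")
        if (c) {
            r = (r ? r ", " : "") c
        }
    }
    print $1, $2, r
    r = c = ""               # reset for the next row
}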

Results and contents of File3:

start end
234 789 hij, klm
678 780 abc
125 457 efg, hij, nmo
534 988 nmo
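
Both scripts index the array with the pair `$1,$2` rather than with a plain concatenation of the two fields. In awk a comma inside a subscript joins the parts with the built-in `SUBSEP` character, which keeps pairs such as "12 3" and "1 23" distinct. The small sketch below illustrates the difference; the array names `joined` and `paired` and the two input lines are made up for the illustration.

```shell
#!/bin/sh
# Line 1 stores the pair (12, 3); line 2 looks up the pair (1, 23).
pair_demo=$(printf '12 3\n1 23\n' | awk '
NR==1 {
    joined[$1 $2]        # key is the plain string "123"
    paired[$1,$2]        # key is "12" SUBSEP "3"
}
NR==2 {
    if (($1 $2) in joined)
        print "plain concatenation: match"
    else
        print "plain concatenation: no match"

    if (($1,$2) in paired)
        print "comma subscript: match"
    else
        print "comma subscript: no match"
}')

printf '%s\n' "$pair_demo"
```

The plain concatenation reports a false match because both lines collapse to "123", while the comma subscript keeps the two pairs apart.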
Licensed under: CC-BY-SA with attribution