Question

I have two files. The first line of each file is a header line.

File1

start end
234 789
678 780
125 457
534 988

File2

start end abc efg hij klm nmo
234 789 NA NA 01 02 NA
678 780 01 NA NA NA NA
125 457 NA 01 01 NA 02
534 988 NA NA NA NA 02

Now I want to compare these two files: column 1 and column 2 of File1 against column 1 and column 2 of File2. Where they match, I want to write a third file containing column 1 and column 2 of File2, followed by the headers of the columns whose field is not 'NA', like the following output file:

start end
234 789 hij, klm
678 780 abc
125 457 efg, hij, nmo
534 988 nmo

I only know how to compare lines; I don't know whether it is possible to print the headers of the fields that don't match the pattern 'NA'.


Solution

You could try awk -f a.awk file2 file1 (note that file2 is passed first), where a.awk is:

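# First input (file2): build a lookup keyed on start/end.
NR==FNR {
    if (NR==1) {
        split($0,b)    # b[i] holds the name of column i
        next
    }
    s=""
    for (i=3; i<=NF; i++) {
        if ($i!="NA") {    # collect the names of non-NA columns
            if (s)
                s=s", " b[i]
            else
                s=b[i]
        }
    }
    a[$1,$2]=s
    next
}
# Second input (file1): skip its header, then print matching rows.
FNR==1 {next}
($1,$2) in a {
    print $1,$2,a[$1,$2]
}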

Output:

234 789 hij, klm
678 780 abc
125 457 efg, hij, nmo
534 988 nmo
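
The output above has no "start end" header line, whereas the output requested in the question starts with one. A minimal sketch of one way to keep it, assuming it is enough to echo the header of the second input (File1): the temporary directory, the file names and the awk variable names hdr, names and sep are choices made for this demo, not part of the original answer.

```shell
#!/bin/sh
# Sketch only: the temp directory and file names are assumptions for this demo.
dir=$(mktemp -d)

# Sample data copied from the question.
cat > "$dir/File1" <<'EOF'
start end
234 789
678 780
125 457
534 988
EOF

cat > "$dir/File2" <<'EOF'
start end abc efg hij klm nmo
234 789 NA NA 01 02 NA
678 780 01 NA NA NA NA
125 457 NA 01 01 NA 02
534 988 NA NA NA NA 02
EOF

# Same approach as a.awk, except that the header line of the second
# input (File1) is printed instead of skipped.
result=$(awk '
NR == FNR {
    if (FNR == 1) { split($0, hdr); next }    # remember File2 column names
    s = ""; sep = ""
    for (i = 3; i <= NF; i++)
        if ($i != "NA") { s = s sep hdr[i]; sep = ", " }
    names[$1, $2] = s
    next
}
FNR == 1 { print; next }                       # File1 header line
($1, $2) in names { print $1, $2, names[$1, $2] }
' "$dir/File2" "$dir/File1")

printf '%s\n' "$result"
rm -rf "$dir"
```

With the sample data this prints the "start end" line followed by the same four rows as above.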

Other tips

Here's one way using awk. Run like:

awk -f ./script.awk File1 File2 > File3

Contents of script.awk:

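# First line of File1: remember the header for the output file.
NR==1 {

    h=$0
    next
}

# Remaining lines of File1: record each start/end pair as a key.
FNR==NR {

    a[$1,$2]
    next
}

# First line of File2: store the column names and print the saved header.
FNR==1 {

    split($0, b)
    print h

    next
}

# File2 rows whose start/end pair also appears in File1.
($1,$2) in a {

    for (i=3;i<=NF;i++) {

        # Column name if the field is not "NA", otherwise empty.
        c = ($i != "NA" ? b[i] : "")

        if (c) {

            # Append, with ", " between names.
            r = (r ? r ", " : "") c
        }
    }

    print $1, $2, r
    r = c = ""    # reset for the next row
}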

Results and contents of File3:

start end
234 789 hij, klm
678 780 abc
125 457 efg, hij, nmo
534 988 nmo
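
Both scripts tell the two input files apart by comparing NR with FNR: NR counts records across all inputs, while FNR restarts at 1 for each file, so the two are equal only while the first file is being read. A small sketch that traces this; the temporary directory and the file names "first" and "second" are assumptions for the demo.

```shell
#!/bin/sh
# Sketch only: the temp directory and file names are assumptions for this demo.
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/first"
printf 'c\nd\n' > "$dir/second"

# Print NR, FNR and whether the record comes from the first input file.
trace=$(awk '{ print NR, FNR, (NR == FNR ? "first-file" : "later-file"), $0 }' \
    "$dir/first" "$dir/second")

printf '%s\n' "$trace"
rm -rf "$dir"
```

This prints:

1 1 first-file a
2 2 first-file b
3 1 later-file c
4 2 later-file d

This is why the order of the file arguments matters in both answers: whichever file is listed first is the one loaded into the lookup array.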
License: CC-BY-SA with attribution
Not affiliated with StackOverflow