Question

I am very, very much a beginner with NAWK (or AWK) but I know that you can check for a substring value using:

nawk '{if (substr($0,42,4)=="ABCD") {print {$0}}}' ${file}

(This is being run through UNIX, hence the '$0'.)

What if the string could be either ABCD or MNOP? Is there an easy way to code this as a one-liner? I've tried looking but so far only found myself lost...

Solution 3

Assuming your values are not regex metacharacters, you could say:

nawk 'substr($0,42,4)~/ABCD|MNOP/' ${file}

If the values contain metacharacters ([, \, ^, $, ., |, ?, *, +, (, )), then you'd need to escape those with a \.
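
As a rough sketch of what that escaping looks like, assume two made-up values, A.CD and M+OP, and a made-up sample file where the field starts at column 1 rather than 42 (plain awk is used here in place of nawk). Unescaped, the pattern matches AXCD and MMOP and misses the literal M+OP; escaped, it matches only the intended values:

```shell
# Hypothetical sample data: the 4-character field starts at column 1.
sample=$(mktemp)
printf '%s\n' 'A.CD first' 'AXCD second' 'M+OP third' 'MMOP fourth' > "$sample"

# Unescaped: "." matches any character and "M+" means "one or more M".
unescaped=$(awk 'substr($0,1,4) ~ /A.CD|M+OP/' "$sample")

# Escaped: "\." and "\+" match a literal dot and a literal plus sign.
escaped=$(awk 'substr($0,1,4) ~ /A\.CD|M\+OP/' "$sample")

printf 'unescaped:\n%s\n\nescaped:\n%s\n' "$unescaped" "$escaped"
rm -f "$sample"
```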

OTHER TIPS

You can combine two comparisons with ||, for example:

nawk 'substr($0,42,4)=="ABCD" || substr($0,42,4)=="MNOP"' ${file}

Note your current command does have some unnecessary parts that awk handles implicitly:

nawk '{if (substr($0,42,4)=="ABCD") {print {$0}}}' ${file}

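Printing $0 is the default awk action, so it can be omitted, and so can the if wrapper, because a bare condition works as a pattern. (As written, print {$0} is also a syntax error: the braces around $0 do not belong there.) Put together, the command reduces to: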

nawk 'substr($0,42,4)=="ABCD"' ${file}

For further reading, see Idiomatic awk.

Test

$ cat a
hello this is me
hello that is me
hello those is me

$ awk 'substr($0,7,4)=="this"' a
hello this is me

$ awk 'substr($0,7,4)=="this" || substr($0,7,4)=="that"' a
hello this is me
hello that is me

If you have a large list of possible valid values, you can declare an array, then check to see if that substring is in the array.

nawk '
    BEGIN { valid["ABCD"] = 1 
            valid["MNOP"] = 1
            # ....
    }
    substr($0,42,4) in valid
' file

One thing to remember: the in operator looks at an associative array's keys, not the values.
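
A small sketch of that distinction, using a made-up extra key WXYZ with value 0 to show that the stored value plays no part in the test:

```shell
report=$(awk 'BEGIN {
    valid["ABCD"] = 1      # "ABCD" is the key, 1 is the value
    valid["WXYZ"] = 0      # hypothetical key whose value is 0

    if ("ABCD" in valid)  print "ABCD is a key"
    if (!(1 in valid))    print "1 is not a key"
    if ("WXYZ" in valid)  print "WXYZ is a key"
}')
printf '%s\n' "$report"
```

This is why the value assigned in the BEGIN block can be anything; only the subscript matters to in.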

You said "string", not "RE" (regular expression), so this is the approach to take for a literal string comparison against multiple values:

awk -v strs='ABCD MNOP' '
BEGIN {
    split(strs,tmp)
    for (i in tmp)
        strings[tmp[i]]
}
substr($0,42,4) in strings
' file
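
To illustrate why the literal comparison matters, here is a sketch under assumptions: the values A.CD and MNOP and the sample file are made up, and the field starts at column 1 instead of 42. Because in compares strings, the dot in A.CD is not treated as a wildcard, so AXCD is not selected:

```shell
# Hypothetical sample data: the 4-character field starts at column 1.
sample=$(mktemp)
printf '%s\n' 'A.CD first' 'AXCD second' 'MNOP third' > "$sample"

matched=$(awk -v strs='A.CD MNOP' '
BEGIN {
    n = split(strs, tmp)         # split the list on whitespace
    for (i = 1; i <= n; i++)
        strings[tmp[i]]          # merely referencing an element creates its key
}
substr($0,1,4) in strings        # literal lookup, no regex involved
' "$sample")

printf '%s\n' "$matched"
rm -f "$sample"
```
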
Licensed under: CC-BY-SA with attribution