Question

I have some text like

CreateMainPageLink("410",$objUserData,$mnt[139]);

from which I want to extract the number 139 that follows mnt, using gawk. I tried the following expression (as part of a pipeline, applied to the output of a grep)

gawk '{FS="[\[\]]";print NF}'

to print the number of fields. With [ and ] as field separators, I expected to see 3 printed (three fields: the text before the opening square bracket, the number I want to extract, and the text after the closing bracket). What I get instead is one field, corresponding to the full line, and two warnings:

gawk: warning: escape sequence `\[' treated as plain `['
gawk: warning: escape sequence `\]' treated as plain `]'

I was following the example given here, but obviously there is some problem/error with my expression.

The following two expressions do not work either:

gawk '{FS="[]"}{print NF;}'
gawk: (FILENAME=- FNR=1) fatal: Unmatched [ or [^: /[]/

and

gawk '{FS="\[\]"}{print NF;}'
gawk: warning: escape sequence `\[' treated as plain `['
gawk: warning: escape sequence `\]' treated as plain `]'
gawk: (FILENAME=- FNR=1) fatal: Unmatched [ or [^: /[]/

Solution

gawk -F[][] '{ print $0" -> "$1"\t"$2; }'

$ gawk -F[][] '{ print $0" -> "$1"\t"$2; }'
titi[toto]tutu
titi[toto]tutu -> titi  toto

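1) You must set FS before the first record is read. Assigning FS inside a main rule only takes effect from the next record on, because the current record has already been split. You could do: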

awk  'BEGIN { FS="[\\[\\]]"; } { print $0" -> "$1"\t"$2; }'

This executes the BEGIN clause before any input is read.

I have to escape each bracket twice: once because it is inside a quoted string, and once more because gawk requires it inside a bracket expression.

I personally prefer to use the -F flag, which is less verbose.

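2) FS="[\[\]]" is wrong: because you are inside a quoted string, the single backslash is consumed by the string itself (hence the warnings). Awk ends up with [[]], which is not the separator you want: it reads as the bracket expression [[] followed by a literal ], so it only matches the two-character sequence [].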

3) FS="[]" is wrong because a ] placed directly after the opening [ is taken literally, so the bracket expression is never closed (hence the "Unmatched [" error).

4) FS="\[\]" is wrong again because it combines errors 2) and 3): the backslashes are consumed by the string, which leaves [] :)
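Point 1) can be checked with a small experiment. This is only a sketch: the two input lines are made up for the illustration, and the exact behaviour of the first command can differ in awk implementations that split records lazily.

```shell
# Two sample records (invented for this illustration).
input='a[1]b
c[2]d'

# FS assigned inside the main rule: the record that triggered the rule
# has normally been split already, so the new FS applies from the next record on.
printf '%s\n' "$input" | awk '{ FS = "[][]"; print NR ": NF=" NF }'

# FS assigned in BEGIN: every record is split on [ or ].
in_begin=$(printf '%s\n' "$input" | awk 'BEGIN { FS = "[][]" } { print NR ": NF=" NF }')
printf '%s\n' "$in_begin"
```

With gawk, the first command reports NF=1 for the first record (it is still split on whitespace) and NF=3 for the second, while the BEGIN version reports NF=3 for both.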

The gawk manual says: The regular expressions in awk are a superset of the POSIX specification. This is why you can use either [\\[\\]] or [][], the latter being the POSIX way.

To include a literal ']' in the list, make it the first character
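To see what the regex engine actually receives from each spelling, one option is to let awk print the string literal back. A minimal sketch (the variable names escaped and posix are only for this illustration):

```shell
# Print the regex text that remains after awk has processed each string literal.
# Doubled backslashes survive as single ones.
escaped=$(awk 'BEGIN { print "[\\[\\]]" }')
# No escaping needed here: the ] comes first in the list.
posix=$(awk 'BEGIN { print "[][]" }')
printf 'escaped form: %s\nPOSIX form:   %s\n' "$escaped" "$posix"
```

The first form reaches the regex engine as [\[\]], the second as [][]; both describe a bracket expression containing [ and ].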


OTHER TIPS

FS="[]" fails: awk expects a list of characters inside the [ ] and there is none, so the expression is reported as an unmatched [.

To use square brackets as separators, write them like this: [][]

This is also wrong: gawk '{FS="[\[\]]";print NF}'. FS needs to be set outside the main rule (in a BEGIN block, with -F, or as a command-line variable assignment).

Example:

echo 'CreateMainPageLink("410",$objUserData,$mnt[139]);' | awk -F[][] '{print $2}'
139

Or

awk  '{print $2}' FS=[][]

Or

awk 'BEGIN {FS="[][]"} {print $2}'

All of these give 139.

Edit: in gawk '{FS="[\[\]]";print NF}' you print the number of fields, NF, and not the value of the last field, $NF. That would not help anyway: splitting your data on [ and ] leaves ); as the last field. Use awk '{print $(NF-1)}' FS=[][] to get the second-to-last field.
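The difference between NF, $NF and $(NF-1) can be seen on the line from the question. A small sketch, with shell variable names chosen only for this illustration:

```shell
line='CreateMainPageLink("410",$objUserData,$mnt[139]);'

# Number of fields when splitting on [ or ].
count=$(printf '%s\n' "$line" | awk -F'[][]' '{ print NF }')
# Content of the last field.
last=$(printf '%s\n' "$line" | awk -F'[][]' '{ print $NF }')
# Content of the second-to-last field, which holds the number.
wanted=$(printf '%s\n' "$line" | awk -F'[][]' '{ print $(NF-1) }')

printf 'NF=%s  $NF=%s  $(NF-1)=%s\n' "$count" "$last" "$wanted"
```

The separator is quoted here so the shell cannot treat [][] as a filename pattern.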

Do you need awk? You can get the value via sed like this:

 # echo 'CreateMainPageLink("410",$objUserData,$mnt[139]);' | sed -n 's:.*\[\([0-9]\+\)\].*:\1:p'
 139
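Note that \+ in a basic regular expression is a GNU extension. If the command has to run on a sed without it, a variant along these lines should behave the same; it is a sketch that simply spells "one or more digits" as [0-9][0-9]*:

```shell
line='CreateMainPageLink("410",$objUserData,$mnt[139]);'

# POSIX basic regex: [0-9][0-9]* stands in for the GNU-only [0-9]\+
num=$(printf '%s\n' "$line" | sed -n 's/.*\[\([0-9][0-9]*\)\].*/\1/p')
printf '%s\n' "$num"
```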
Licensed under: CC-BY-SA with attribution