Question

I am trying to understand some awk commands related to the question "If statement inside awk to change a value".

The input file is

...
MODE P E
IMP:P 1 19r 0
IMP:E 1 19r 0
...
SDEF POS= 0 0 14.6 AXS= 0 0 1 EXT=d3 RAD= d4 cell=23 ERG=d1 PAR=2
SI1 L  0.020
SP1    1
SI4 0. 3.401                                                                    
SI3 0.9 
...
NPS 20000000

and the code is

#! /bin/bash

vals=(0.02 0.04)

for val in "${vals[@]}"; do
awk -v val="$val" '
  BEGIN { i=1; split (val,v," ") }
  # If it is a string, find the sequence SI1 L and change the value after that, using values that the user inputs
  /SDEF POS.*ERG=[a-zA-Z]+/ { flag="y" ; }
  /SI1 L/ { if (flag=="y") { $3=v[i]; i++; flag="n"; } }
  # If it is a number, change the number using values that the user inputs.
  /SDEF POS.*ERG=[0-9]+ / { sub(/ERG=[0-9]*/, "ERG="v[i],$0);i++; }
  1
' 20small > "${val}"
done

I am trying to understand the following:

  1. Why is there a + sign after [a-zA-Z] and [0-9]? I understand that those expressions search for any letter or digit after the given sequence, but I don't understand what the + does.
  2. What exactly does flag do? My understanding is that /<expression>/ flag="y" "replaces" the <expression> with y, but I can't see the reason for changing the flag a few lines below. Is it like a dummy flag?
  3. What exactly does / / do? I know that it declares the search pattern. Why add SDEF and POS? Just to be sure it's the correct line? I also find the use of .* confusing. Does it mean "find the SDEF POS line and, on the same line, look for ERG no matter what is between them"? What exactly does /SDEF POS.*ERG=[a-zA-Z]+/ do?

I am a rookie in awk, so I am trying to learn it by example. Any help would be more than welcome!

Solution

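  1. + means "one or more repetitions" of the preceding item. For example, a+ matches a, aa, aaa and so on. a* matches the same strings as a+, but also matches when there is no a at all. So ERG=[a-zA-Z]+ requires at least one letter directly after ERG=, and [0-9]+ requires at least one digit.

  2. flag is an ordinary awk variable used as a temporary information holder. /<expression>/ { flag="y" } does not replace anything: it runs the assignment on every line that matches the expression. Here flag is set to "y" when the SDEF line has a letter after ERG=, so that the SI1 L line that follows knows it should be changed. It is set back to "n" afterwards so that no further SI1 L lines are changed until another matching SDEF line is seen.

  3. / / holds a regular expression, and the action after it runs only on lines that match the expression between the two /. SDEF POS is there to pick out the right line, and .* means "any sequence of characters, including none". So /SDEF POS.*ERG=[a-zA-Z]+/ matches a line that contains SDEF POS followed, anywhere later on the same line, by ERG= and at least one letter.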
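
The three points above can be tried out with a small sketch. The sample lines below are hypothetical, modelled on the input file in the question, and the variable names plus, star and result are only for illustration. It is a simplified version of the script in the question, not a replacement for it.

```shell
#!/bin/bash
# Hypothetical sample lines modelled on the input file in the question.
input='SDEF POS= 0 0 14.6 ERG=d1 PAR=2
SI1 L  0.020
SI1 L  0.500
SDEF POS= 0 0 14.6 ERG=5 PAR=2'

# Point 1: + needs at least one letter after ERG=, * also accepts none.
plus=$(printf 'ERG=\nERG=d1\n' | awk '/ERG=[a-zA-Z]+/ { n++ } END { print n+0 }')
star=$(printf 'ERG=\nERG=d1\n' | awk '/ERG=[a-zA-Z]*/ { n++ } END { print n+0 }')
echo "lines matched with +: $plus, with *: $star"

# Points 2 and 3: the pattern selects the SDEF line, and flag remembers
# that it was seen so that only the next SI1 L line is changed.
result=$(printf '%s\n' "$input" | awk -v val=0.04 '
  /SDEF POS.*ERG=[a-zA-Z]+/ { flag = "y" }            # ERG= followed by a letter
  /SI1 L/ && flag == "y"    { $3 = val; flag = "n" }  # change once, then reset
  1                                                   # print every line
')
echo "$result"
```

With this input, the first SI1 L line is printed as SI1 L 0.04 (assigning to $3 makes awk rebuild the line with single spaces), the second SI1 L line is left untouched because flag is already "n", and the last SDEF line does not set the flag because ERG= is followed by a digit, not a letter.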

I suggest you look at some awk one-liners and try to understand how they work. Also start by reading some awk tutorials; you can find plenty with a web search.

Licensed under: CC-BY-SA with attribution