Question

I have a large file whose fourth column contains decimal values. How can I calculate the percentage of records where that value is less than 2?

test test test .001
test2 test2 test2 8.0993
test3 test3 test3 .00001
test4 test4 test4 1.001

Solution

This awk script will do it:

awk '{ $4 < 2 ? l++ : g++ } END { printf "<  2 : %s %%\n", l*100/(g+l); printf ">= 2 : %s %%\n", g*100/(g+l) }' your.file

For better readability, place it in a file:

percent.awk:

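# l counts records with column 4 below 2, g counts the rest
{ $4 < 2 ? l++ : g++ }
END {
    # a literal percent sign in a printf format must be written as %%
    printf "<  2 : %s %%\n", l*100/(g+l)
    printf ">= 2 : %s %%\n", g*100/(g+l)
}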

and execute it like:

awk -f percent.awk your.file 
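
If the file might be empty or contain short lines, a slightly more defensive variant could look like the sketch below. It is only one possible approach, not part of the original answer: the function name percent_below_two is made up for illustration, skipping records with fewer than four fields is an assumption about what you want, and the two-decimal output format is a matter of taste.

```shell
# Hypothetical helper: prints the share of records whose 4th field is below 2.
# Reads the files given as arguments, or standard input if there are none.
percent_below_two() {
    awk '
        NF < 4 { next }             # assumption: ignore records without a 4th field
        { total++ }                 # count every remaining record
        $4 + 0 < 2 { below++ }      # adding 0 forces a numeric comparison
        END {
            if (total == 0) { print "no records"; exit }   # avoid dividing by zero
            printf "<  2 : %.2f %%\n", below * 100 / total
            printf ">= 2 : %.2f %%\n", (total - below) * 100 / total
        }
    ' "$@"
}

# Run it on the sample data from the question.
result=$(printf '%s\n' \
    'test test test .001' \
    'test2 test2 test2 8.0993' \
    'test3 test3 test3 .00001' \
    'test4 test4 test4 1.001' | percent_below_two)
echo "$result"
```

For the four sample records, three values are below 2, so this prints 75.00 % for the first line and 25.00 % for the second.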
Licensed under: CC-BY-SA with attribution