awk or sed sum similar value in rows [duplicate]

https://stackoverflow.com/questions/23532337

17-07-2023

Question

I have a file with values such as:

(X1 55) (X2 99) (X3 29) (X1 3) (X3 10)
(X1 21) (X3 11) (X1 9)

Is there a way to sum the values by Xn name within each row, so that the output is:

(X1 58) (X2 99) (X3 39)
(X1 30) (X3 11)

I'm not sure which is best to use, awk, sed or...? I tried this:

awk '{for (i=t=0;i<NF;) t+=$++i; $0=t}1' file

196
41

It obviously sums all values together, so maybe it's a bit more complex.

Solution

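$ awk '{
    # fields alternate "(Xn" and "value)"; adding "55)" uses its numeric prefix, 55
    for (i=1;i<NF;i+=2) {
        sum[$i]+=$(i+1)
    }
    ofs = ""
    for (key in sum) {
        printf "%s%s %d)", ofs, key, sum[key]   # key still carries its "("
        delete sum[key]                         # reset so the next row starts empty
        ofs = OFS
    }
    print ""
}' file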
(X2 99) (X3 39) (X1 58)
(X3 11) (X1 30)

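The order in which `for (key in sum)` visits the keys is unspecified, so the fields can come out in a different order than in the input, as above. If you care about the order of the fields, there are various ways to keep the original order...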
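One such way, as a minimal sketch rather than the answerer's own method: record each key in a second array the first time it appears in a row, then print in that recorded order. The function name `sum_rows` and the helper array `order` are made up for illustration; everything else is plain POSIX awk.

```shell
# sum_rows: per-row totals, printed in first-seen key order (hypothetical helper name)
sum_rows() {
    awk '{
        n = 0                                      # distinct keys seen so far in this row
        for (i = 1; i < NF; i += 2) {
            key = $i                               # e.g. "(X1"
            if (!(key in sum)) order[++n] = key    # remember first appearance
            sum[key] += $(i + 1)                   # "55)" is added as 55
        }
        for (j = 1; j <= n; j++) {
            key = order[j]
            printf "%s%s %d)", (j > 1 ? OFS : ""), key, sum[key]
            delete sum[key]                        # reset both arrays for the next row
            delete order[j]
        }
        print ""
    }' "$@"
}

printf '%s\n' '(X1 55) (X2 99) (X3 29) (X1 3) (X3 10)' '(X1 21) (X3 11) (X1 9)' | sum_rows
```

With the sample data from the question this should print `(X1 58) (X2 99) (X3 39)` and `(X1 30) (X3 11)`. Called with a file name (`sum_rows file`), it reads the file instead of standard input.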

OTHER TIPS

Here you are:

echo '(X1 55) (X2 99) (X3 29) (X1 3) (X3 10)' | sed 's/[()]//g' | awk '{for( i=1; i<NF; i+=2) a[$i]+=$(i+1);} {for (keys in a ) print keys, a[keys];}'

Output (the order of the lines may differ between awk implementations):

X1 58
X2 99
X3 39

I think that's close enough?

I'd go with manipulating FS and RS to closely match the data. Note that a regular-expression RS is itself an extension (gawk and mawk support it). awk's array traversal order is unspecified unless you are using gawk >= 4.0 and set PROCINFO["sorted_in"] to specify the method of traversal.

$ awk '
    NF{ x[$1] += $2 }
    2==NF{
        for (i in x) { printf "%s(%s %s)", sep, i, x[i]; sep = " "; }
        print "";
        split("",x); sep = "";
    }
    ' RS='[\\n\\(]' FS='[ \\n]' /tmp/file.txt
(X1 58)  (X2 99)  (X3 39)
(X1 30)  (X3 11)
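For the PROCINFO["sorted_in"] route mentioned above, a rough sketch might look like the following. It assumes gawk 4.0 or later is installed as `gawk`. The function name `sum_rows_sorted` is invented for this example, and it uses the default field splitting from the accepted answer rather than the RS/FS trick.

```shell
# sum_rows_sorted: per-row totals with keys in ascending string order (gawk only)
sum_rows_sorted() {
    gawk '
    BEGIN { PROCINFO["sorted_in"] = "@ind_str_asc" }   # for-in now visits indices sorted as strings
    {
        for (i = 1; i < NF; i += 2)
            sum[$i] += $(i + 1)                        # "(X1" is the key, "55)" counts as 55
        sep = ""
        for (key in sum) {
            printf "%s%s %d)", sep, key, sum[key]
            sep = OFS
        }
        print ""
        delete sum                                     # clear the whole array before the next row
    }' "$@"
}
```

Usage would be `sum_rows_sorted file`. Because the comparison is by string, a key such as X10 would sort before X2. That does not matter for the sample data but may for longer name ranges.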

Licensed under: CC-BY-SA with attribution
