AWK program to find the average rainfall of three states

https://stackoverflow.com/questions/3950778

awk
gawk

08-10-2019

Question

I want to find the average rainfall of any three states, say CA, TX, and AX, for a particular month from January through December. The input file is tab-delimited and has the format: city name, state, then the average rainfall amounts from January through December, followed by an annual average for all months. For example, it may look like this:

AVOCA   PA  30  2.10    2.15    2.55    2.97    3.65    3.98    3.79    3.32     3.31   2.79    3.06    2.51    36.18
BAKERSFIELD CA  30  0.86    1.06    1.04    0.57    0.20    0.10    0.01    0.09    0.17    0.29    0.70    0.63    5.72

What I need to do is get the sum of the average rainfall for a particular month, say February, over, say, n years, and then find its average for the states CA, TX, and AX.

I wrote the awk script below to do this, but it does not give me the expected output:

/^CA$/ {CA++; CA_SUM+= $5} # ^CA$ - Regular Expression to match the word CA only 
/^TX$/ {TX++; TX_SUM+= $5} # ^TX$ - Regular Expression to match the word TX only  
/^AX$/ {AX++; AX_SUM+= $5} # ^AX$ - Regular Expression to match the word AX only 
END {
     CA_avg = CA_SUM/CA;
     TX_avg = TX_SUM/TX;
     AX_avg = AX_SUM/AX; 
     printf("CA Rainfall: %5.2f",CA_avg);
     printf("CA Rainfall: %5.2f",TX_avg);
     printf("CA Rainfall: %5.2f",AX_avg);
    }

I invoke the program with the command awk 'FS="\t"'-f awk1.awk rainfall.txt and see no output.

Question: Where am I slipping up? Any suggestions and a modified version of the code would be appreciated.

Solution

Your regexp should be:

/ CA / {CA++; CA_SUM+= $5} # matches CA surrounded by spaces anywhere on the line
/ TX / {TX++; TX_SUM+= $5} # likewise for TX
/ AX / {AX++; AX_SUM+= $5} # likewise for AX

/^AX$/ matches only if AX is the only text on the line.

HTH!

EDIT

/ CA / {CA++; CA_SUM+= $5} # matches CA surrounded by spaces anywhere on the line
/ TX / {TX++; TX_SUM+= $5} # likewise for TX
/ AX / {AX++; AX_SUM+= $5} # likewise for AX
END {
    # Guard each division so a state with no matching rows is skipped
    if (CA != 0) { CA_avg = CA_SUM/CA; printf("CA Rainfall: %5.2f\n", CA_avg) }
    if (TX != 0) { TX_avg = TX_SUM/TX; printf("TX Rainfall: %5.2f\n", TX_avg) }
    if (AX != 0) { AX_avg = AX_SUM/AX; printf("AX Rainfall: %5.2f\n", AX_avg) }
}

Other tips

The pattern /^CA$/ means the characters "C" and "A" are the only characters on the line. You want:

$2 == "CA" {CA++; CA_SUM+= $5}
# etc.
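
To compare the patterns discussed so far, here is a small sketch that runs each one against two tab-separated rows modeled on the sample in the question. The temporary file and the count_matches helper are assumptions made for this illustration and are not part of the original scripts. If the file really is tab-delimited, as the question states, only the field comparison should select the CA row.

```shell
#!/bin/sh
# Two tab-separated rows modeled on the question's sample (trimmed to 5 fields).
sample=$(mktemp)
printf 'AVOCA\tPA\t30\t2.10\t2.15\n' >  "$sample"
printf 'BAKERSFIELD\tCA\t30\t0.86\t1.06\n' >> "$sample"

# Hypothetical helper: count the lines selected by an awk pattern.
count_matches() {
    awk -F '\t' "$1"' { n++ } END { print n + 0 }' "$sample"
}

whole_line=$(count_matches '/^CA$/')      # line must be exactly "CA"
spaced=$(count_matches '/ CA /')          # needs literal spaces around CA
field_test=$(count_matches '$2 == "CA"')  # compares the second field only

echo "whole line: $whole_line, spaced: $spaced, field: $field_test"
rm -f "$sample"
```

Comparing the field directly avoids depending on which whitespace characters surround the state code.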

However, this is DRYer:

{ count[$2]++; sum[$2] += $5 }
END {
    for (state in count) {
        printf("%s Rainfall: %5.2f\n", state, sum[state]/count[state])
    }
}
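
As a quick check of the array-based version, the sketch below feeds it three tab-separated rows and averages the February column (field 5). The first two rows are trimmed copies of the sample in the question; the third row, EXAMPLE_CITY, is invented purely so that CA has two values to average. The temporary file and the sort step are also assumptions of this sketch.

```shell
#!/bin/sh
rows=$(mktemp)
# First two rows come from the question (trimmed to five fields);
# the third row is invented purely to give CA a second data point.
printf 'AVOCA\tPA\t30\t2.10\t2.15\n' >  "$rows"
printf 'BAKERSFIELD\tCA\t30\t0.86\t1.06\n' >> "$rows"
printf 'EXAMPLE_CITY\tCA\t30\t1.00\t2.00\n' >> "$rows"

# for-in order is unspecified in awk, so sort the report for stable output.
report=$(awk -F '\t' '
    { count[$2]++; sum[$2] += $5 }
    END {
        for (state in count)
            printf("%s Rainfall: %5.2f\n", state, sum[state]/count[state])
    }
' "$rows" | sort)

echo "$report"
rm -f "$rows"
```

Note that this version reports every state found in the file, not only the three of interest.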

Also, this looks wrong: awk 'FS="\t"'-f awk1.awk rainfall.txt
try: awk -F '\t' -f awk1.awk rainfall.txt
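
A likely explanation for the missing output, sketched here assuming a POSIX shell: because there is no space between the closing quote and -f, the shell joins them into the single word FS="\t"-f, and awk takes that word as the program text. The program is then a lone pattern, an assignment whose value is "\t" minus the unset variable f, which is 0. That is false for every record, so nothing is printed, and awk1.awk is read as a data file rather than as the script. The temporary file below is made up for the demonstration.

```shell
#!/bin/sh
demo=$(mktemp)
printf 'BAKERSFIELD\tCA\t30\t0.86\t1.06\n' > "$demo"

# What the shell actually passes to awk in the original command:
# the program text is FS="\t"-f, a pattern that evaluates to 0 (false).
broken=$(awk 'FS="\t"'-f "$demo")

# With -F as a separate option, the inline program sees tab-separated fields.
fixed=$(awk -F '\t' '{ print $2 }' "$demo")

echo "broken: [$broken]"
echo "fixed:  [$fixed]"
rm -f "$demo"
```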

Response to comments:

awk -F '\t' -v month=2 -v states="CA,AZ,TX" '
    BEGIN {
        month_col = month + 3  # assume January is month 1, stored in field 4
        n = split(states, wanted_states, /,/)
    }
    { count[$2]++; sum[$2] += $month_col }
    END {
        # split() stores the state codes as values, so walk the array by position
        for (i = 1; i <= n; i++) {
            state = wanted_states[i]
            if (state in count)
                printf("%s Rainfall: %5.2f\n", state, sum[state]/count[state])
            else
                print state " Rainfall: no data"
        }
    }
' rainfall.txt

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow