Question

I have a list of about 3000 IP addresses produced by piping pdsh output through dshbak -c, which formats the output into a readable form. I like the readability of dshbak -c, but the problem is that IPs with common octets are collapsed to save space. I need the full IP addresses for the rest of my project.

Is there an easy way to convert this input:

192.168.38.[217,222],192.168.40.215,192.168.41.[219-222]

to this output:

192.168.38.217,192.168.38.222,192.168.40.215,192.168.41.219,192.168.41.220,192.168.41.221,192.168.41.222

I was thinking sed could be used directly, but I'm not sure how to store the common octets in a variable. For this reason, I believe a bash script will be needed along with sed. Any help or pointers in the right direction would be appreciated.

Solution

If you can change the input, you can use the following form, which relies on Bash brace expansion:

echo 192.168.38.{217,222} 192.168.40.215 192.168.41.{219..222} | tr ' ' ','

Otherwise, you can transform the input with a command and eval the result:

eval echo $( echo '192.168.38.[217,222],192.168.40.215,192.168.41.[219-222]' | \
 sed 's/,/ /g;s/\[/{/g;s/]/}/g;s/-/../g;s/\({[0-9]\+\) \([0-9]\+}\)/\1,\2/g' | \
 grep -v '[^0-9{}., ]' ) | tr ' ' ','

Note that eval is pretty dangerous on unvalidated data, so I use grep -v '[^0-9{}., ]' to drop any line containing unexpected symbols. The sed in this command just transforms your original string into the brace-expansion form mentioned above; as written, it only restores the comma for a two-item list inside the brackets.
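As a rough sketch of the same validate-then-eval idea, the function below checks the input against a whitelist before anything reaches eval, then builds the brace expression in Bash itself. The name expand_ips and the error messages are made up for illustration; it assumes Bash and input shaped like the example in the question. Unlike the sed version, it wraps each a-b range in its own braces, so lists with more than two items and mixed lists such as [1,5-7] should expand as well.

```shell
#!/usr/bin/env bash

# expand_ips: hypothetical helper, not part of pdsh or dshbak.
# Expands a collapsed list such as 192.168.41.[219-222] via brace expansion.
expand_ips() {
    local input=$1 expr='' rest body item joined ch
    local -a items
    local allowed='^[][0-9.,-]+$'

    # Whitelist: only digits, dots, commas, hyphens and square brackets,
    # so eval never sees shell metacharacters.
    if [[ ! $input =~ $allowed ]]; then
        echo "expand_ips: rejected input" >&2
        return 1
    fi

    rest=$input
    while [[ -n $rest ]]; do
        if [[ $rest == '['*']'* ]]; then
            # Take the text between this [ and the next ].
            body=${rest:1}
            body=${body%%']'*}
            rest=${rest:${#body}+2}
            IFS=, read -ra items <<< "$body"
            joined=''
            for item in "${items[@]}"; do
                # a-b becomes the brace range {a..b}
                [[ $item == *-* ]] && item="{${item/-/..}}"
                joined+=${joined:+,}$item
            done
            # A lone item gets no outer braces: {x} by itself does not expand.
            if (( ${#items[@]} > 1 )); then
                expr+="{$joined}"
            else
                expr+=$joined
            fi
        else
            ch=${rest:0:1}
            case $ch in
                '['|']') echo "expand_ips: unbalanced bracket" >&2; return 1 ;;
                ,) ch=' ' ;;   # commas outside brackets separate addresses
            esac
            expr+=$ch
            rest=${rest:1}
        fi
    done

    # expr now holds only digits, dots, commas, hyphens, braces and spaces.
    eval "set -- $expr"
    local IFS=,
    echo "$*"
}
```

Calling expand_ips '192.168.38.[217,222],192.168.40.215,192.168.41.[219-222]' prints the comma-separated list of full addresses; input containing any other character makes the function return a non-zero status instead of reaching eval.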

OTHER TIPS

If you are willing to use awk, you can try this:

echo "192.168.38.[217,222],192.168.40.215,192.168.41.[219-222]" |sed 's/\[//g' | sed 's/\]//g' | awk -F, '{for(i=1;i<=NF;i++){n=split($i,a,".");IPL="";if(n>1){PIP=a[1] "." a[2] "." a[3];}else{IPL=PIP "." $i;}if(index(a[4],"-") > 0){x=0;split(a[4],b,"-");for(j=b[1];j<=b[2];j++){if(x==0){IPL=PIP "." j;x++;}else{IPL=IPL "," PIP "." j;}}}else if(index(a[4],",") > 0){split(a[4],b,",");IPL=PIP "." b[1] "," PIP "." b[2];}else{if(length(IPL)<=3){IPL=PIP "." a[4];}}printf("%s,",IPL);}}'

If you are interested in using this, I can explain the logic.

This is one way to process it purely in Bash, with no awk, sed, or other external tools.

#!/bin/bash

shopt -s extglob
IFS=,

while read -r LINE; do
    OUTPUT=()
    while [[ -n $LINE ]]; do
        case "$LINE" in
        +([[:digit:]]).+([[:digit:]]).+([[:digit:]]).+([[:digit:]]))
            OUTPUT[${#OUTPUT[@]}]=$LINE
            break
            ;;
        +([[:digit:]]).+([[:digit:]]).+([[:digit:]]).+([[:digit:]]),*)
            OUTPUT[${#OUTPUT[@]}]=${LINE%%,*}
            LINE=${LINE#*,}
            ;;
        +([[:digit:]]).+([[:digit:]]).+([[:digit:]]).\[+([[:digit:],-])\]*)
            SET=${LINE%%\]*}
            PREFIX=${SET%%\[*}
            read -a RANGES <<< "${SET:${#PREFIX} + 1}"
            for R in "${RANGES[@]}"; do
                case "$R" in
                +([[:digit:]]))
                    OUTPUT[${#OUTPUT[@]}]=${PREFIX}${R}
                    ;;
                +([[:digit:]])-+([[:digit:]]))
                    X=${R%%-*} Y=${R##*-}
                    if [[ X -le Y ]]; then
                        for (( I = X; I <= Y; ++I )); do
                            OUTPUT[${#OUTPUT[@]}]=${PREFIX}${I}
                        done
                    else
                        for (( I = X; I >= Y; --I )); do
                            OUTPUT[${#OUTPUT[@]}]=${PREFIX}${I}
                        done
                    fi
                    ;;
                esac
            done
            LINE=${LINE:${#SET} + 2}
            ;;
        *)
            # echo "Invalid token: $LINE" >&2
            break
        esac
    done
    echo "${OUTPUT[*]}"
done

For an input of

192.168.38.[217,222],192.168.40.215,192.168.41.[219-222]

Running bash temp.sh < temp.txt yields

192.168.38.217,192.168.38.222,192.168.40.215,192.168.41.219,192.168.41.220,192.168.41.221,192.168.41.222

It also handles descending ranges consistently: if X is greater than Y, e.g. 200-100, it generates the IPs counting down from 200 to 100. The script can also process multi-line input.

It should also work with mixed ranges like [100,200-250].
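To make the range handling easier to follow in isolation, here is a minimal sketch of just that inner step, pulled out into a function. The name expand_set and its two-argument interface are assumptions for illustration, not part of the script above; it assumes every item is either a number or a well-formed a-b pair.

```shell
#!/usr/bin/env bash

# expand_set PREFIX BODY: hypothetical helper for illustration only.
# BODY is the text between the brackets, e.g. "100,200-198".
expand_set() {
    local prefix=$1 body=$2 item first last i
    local -a items out=()
    IFS=, read -ra items <<< "$body"
    for item in "${items[@]}"; do
        # For a plain number, first and last are the same value.
        first=${item%%-*} last=${item##*-}
        # 10# forces base 10 so an item like 08 is not read as octal.
        if (( 10#$first <= 10#$last )); then
            for (( i = 10#$first; i <= 10#$last; i++ )); do out+=("$prefix$i"); done
        else
            for (( i = 10#$first; i >= 10#$last; i-- )); do out+=("$prefix$i"); done
        fi
    done
    local IFS=,
    echo "${out[*]}"
}
```

For example, expand_set 10.0.0. 100,200-198 walks the single item first and then counts down through the descending range.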

With GNU awk:

$ cat tst.awk
BEGIN{ FS=OFS="," }
{
    $0 = gensub(/(\[[[:digit:]]+),([[:digit:]]+\])/,"\\1+\\2","g")
    gsub(/[][]/,"")

    for (i=1;i<=NF;i++) {
        split($i,a,/\./)
        base  = a[1] "." a[2] "." a[3]
        range = a[4]
        split(range,r,/[+-]/)

        printf (i>1 ? "," : "")
        if (range ~ /\+/) {
            printf "%s.%s,", base, r[1]
            printf "%s.%s", base, r[2]
        }
        else if (range ~ /-/) {
            for (j=r[1]+0; j<=r[2]+0; j++) {
                printf "%s%s.%s", (j>r[1] ? "," : ""), base, j
            }
        }
        else {
            printf "%s.%s", base, range
        }
    }
    print ""
}
$
$ awk -f tst.awk file
192.168.38.217,192.168.38.222,192.168.40.215,192.168.41.219,192.168.41.220,192.168.41.221,192.168.41.222

The gensub() call changes the comma inside the square brackets to a different character (+) so that the commas outside the brackets can serve as field separators. gensub() is a gawk extension, which is what makes this script gawk-specific.
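If gawk is not available, the same marker idea can probably be done without gensub() by walking the line one character at a time and swapping only the commas that sit inside brackets. The sketch below is an assumption-laden variant, not the script above: the wrapper name expand_ips_awk is made up, it reads lines on standard input, and it assumes well-formed input with ascending ranges. Because it marks every in-bracket comma, it also covers lists with more than two items.

```shell
#!/usr/bin/env bash

# expand_ips_awk: hypothetical wrapper name; uses only POSIX awk features.
expand_ips_awk() {
    awk '
    {
        # Pass 1: drop the brackets and mark in-bracket commas with "+".
        line = ""; inside = 0
        for (k = 1; k <= length($0); k++) {
            c = substr($0, k, 1)
            if (c == "[") { inside = 1; continue }
            if (c == "]") { inside = 0; continue }
            if (c == "," && inside) c = "+"
            line = line c
        }

        # Pass 2: the remaining commas separate addresses.
        nf = split(line, f, ",")
        out = ""
        for (i = 1; i <= nf; i++) {
            split(f[i], a, "[.]")
            base = a[1] "." a[2] "." a[3]
            ni = split(a[4], item, "[+]")
            for (m = 1; m <= ni; m++) {
                lo = hi = item[m]
                if (split(item[m], r, "-") == 2) { lo = r[1]; hi = r[2] }
                for (j = lo + 0; j <= hi + 0; j++)
                    out = out (out == "" ? "" : ",") base "." j
            }
        }
        print out
    }'
}
```

It can be used as a filter, for example printf '%s\n' '192.168.41.[219-222]' | expand_ips_awk.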

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow