Question

One of the columns in my file is URL-encoded. I need to decode that column and then perform some operations based on the values inside it. Is there any way to decode that column in awk?

Solution

You will have to adapt it to your file format, but the basic principle is shown here (tested with GNU Awk 3.1.7):

sh$ echo 'Hello%2C%20world%20%21' | awk '
     {
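         # Try every character code from 0x20 to 0x3F (space, digits, most punctuation).
         for (i = 0x20; i < 0x40; ++i) {
             repl = sprintf("%c", i);
             # & and \ are special in a gsub() replacement, so escape them.
             if ((repl == "&") || (repl == "\\"))
                 repl = "\\" repl;
             # Replace both the upper-case and the lower-case hex spelling.
             gsub(sprintf("%%%02X", i), repl);
             gsub(sprintf("%%%02x", i), repl);
         }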
         }
         print
     }
 '
Hello, world !
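
Note that the loop above only covers codes 0x20 to 0x3F, and because each code is substituted in turn, input such as %2526 is decoded twice (first to %26, then to &). If that matters for your data, here is a rough sketch of a single-pass alternative that avoids gawk-only features. The names urldecode, chr, out and rest are made up for this example, and it assumes a literal "+" should stay a "+" (form-style encoding is not handled):

```shell
# Hypothetical helper: decode %XX escapes on every input line.
urldecode() {
    # LC_ALL=C so that sprintf("%c", n) yields single bytes in gawk as well.
    LC_ALL=C awk '
    BEGIN {
        # Lookup table from two-digit upper-case hex to the matching byte.
        # 00 is left out on purpose, so %00 is simply dropped.
        for (i = 1; i < 256; i++)
            chr[sprintf("%02X", i)] = sprintf("%c", i)
    }
    {
        out = ""
        rest = $0
        # Walk left to right; text that is already decoded is never rescanned.
        while (match(rest, /%[0-9A-Fa-f][0-9A-Fa-f]/)) {
            out = out substr(rest, 1, RSTART - 1) chr[toupper(substr(rest, RSTART + 1, 2))]
            rest = substr(rest, RSTART + 3)
        }
        print out rest
    }'
}
```

With this, echo 'Hello%2C%20world%20%21' | urldecode should print the same "Hello, world !" as above, while %2526 comes out as %26.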

If you have gawk, you can wrap that in a function (credit to brendanh in a comment below):

# i and repl are listed as extra parameters so that they stay local to the function.
function urlDecode(url,    i, repl) {
    for (i = 0x20; i < 0x40; ++i) {
        repl = sprintf("%c", i);
        if ((repl == "&") || (repl == "\\")) {
            repl = "\\" repl;
        }
        url = gensub(sprintf("%%%02X", i), repl, "g", url);
        url = gensub(sprintf("%%%02x", i), repl, "g", url);
    }
    return url;
}
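
To tie this back to the question, a function like the one above could be applied to a single column. The following is only a sketch under some assumptions: the file is tab-separated, the encoded value is in column 2, and the "keep rows containing a comma" test is a placeholder for whatever operation you actually need. It uses gsub() with a target argument instead of gensub(), and decimal loop bounds, so it should not need gawk; decode_col2 is a made-up name. It inherits the 0x20 to 0x3F range of the original loop.

```shell
# Hypothetical wrapper: decode column 2 of tab-separated input, then filter on it.
decode_col2() {
    awk -F '\t' -v OFS='\t' '
    # Same loop as in the answer; i and repl are extra parameters, hence local.
    function urlDecode(url,    i, repl) {
        for (i = 32; i < 64; ++i) {          # 0x20..0x3F written in decimal
            repl = sprintf("%c", i)
            if (repl == "&")
                repl = "\\&"                 # a bare & would mean "matched text"
            gsub(sprintf("%%%02X", i), repl, url)
            gsub(sprintf("%%%02x", i), repl, url)
        }
        return url
    }
    {
        $2 = urlDecode($2)                   # decode only the second column
        if ($2 ~ /,/)                        # placeholder condition on the decoded value
            print
    }'
}
```

Called as decode_col2 < yourfile, it prints only the rows whose decoded second column contains a comma, with that column shown decoded.
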
Licensed under: CC-BY-SA with attribution