Running the date command once per line for millions of lines is going to be painfully slow, because every call starts a new process. Anything that avoids that is going to be faster. One answer has suggested sed, which has many merits; another suggested Perl, which does too.
Using awk, you could look at:
awk 'BEGIN {
         m["Jan"] = "01"; m["Feb"] = "02"; m["Mar"] = "03";
         m["Apr"] = "04"; m["May"] = "05"; m["Jun"] = "06";
         m["Jul"] = "07"; m["Aug"] = "08"; m["Sep"] = "09";
         m["Oct"] = "10"; m["Nov"] = "11"; m["Dec"] = "12";
     }
     {
         printf "2014%s%02d,%s,", m[$1], $2, $3;
         pad = ""
         for (i = 4; i <= NF; i++) { printf("%s%s", pad, $i); pad = " " }
         printf "\n"
     }
    ' log-file
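If you have GNU awk, it has time manipulation functions built in (such as mktime and strftime), though frankly, treating the date information as strings and numbers, as shown, is just as effective. The year is hard-coded as 2014 because the log lines do not include one.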
Given an input log file like this:
Apr 22 23:08:26 a,x,y
Apr 22 23:08:26 b,y,z
Apr 22 23:08:26 c,s,s
Jan 31 00:19:50 c,info with spaces,some more info
Feb 2 00:20:41 c,info with spaces,some more info
Mar 13 00:31:32 c,info with spaces,some more info
May 5 00:42:23 c,info with spaces,some more info
Jun 16 00:53:14 c,info with spaces,some more info
Jul 27 00:04:05 c,info with spaces,some more info
Aug 8 00:15:56 c,info with spaces,some more info
Sep 29 00:26:47 c,info with spaces,some more info
Oct 30 00:37:38 c,info with spaces,some more info
Nov 12 00:49:29 c,info with spaces,some more info
Dec 22 00:50:10 c,info with spaces,some more info
It generates output like this:
20140422,23:08:26,a,x,y
20140422,23:08:26,b,y,z
20140422,23:08:26,c,s,s
20140131,00:19:50,c,info with spaces,some more info
20140202,00:20:41,c,info with spaces,some more info
20140313,00:31:32,c,info with spaces,some more info
20140505,00:42:23,c,info with spaces,some more info
20140616,00:53:14,c,info with spaces,some more info
20140727,00:04:05,c,info with spaces,some more info
20140808,00:15:56,c,info with spaces,some more info
20140929,00:26:47,c,info with spaces,some more info
20141030,00:37:38,c,info with spaces,some more info
20141112,00:49:29,c,info with spaces,some more info
20141222,00:50:10,c,info with spaces,some more info
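Since the log lines carry no year, the script above bakes 2014 into the format string. A minimal sketch of a variant that takes the year as an awk variable is shown below. The function name convert_log and the variable names year and name are hypothetical, chosen only for illustration; the transformation itself is the same as above, and only POSIX awk features are assumed.

```shell
# Hypothetical helper (not part of the original answer):
#   convert_log YEAR [FILE...]
# Reads the named files, or standard input if none are given.
convert_log() {
    year=$1
    shift
    awk -v year="$year" '
        BEGIN {
            # Build the month-name to month-number map from a single string.
            n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", name, " ")
            for (i = 1; i <= n; i++) m[name[i]] = sprintf("%02d", i)
        }
        {
            # Date as YYYYMMDD, then the time, then the rest of the line.
            printf "%s%s%02d,%s,", year, m[$1], $2, $3
            pad = ""
            for (i = 4; i <= NF; i++) { printf "%s%s", pad, $i; pad = " " }
            printf "\n"
        }
    ' "$@"
}

# Example: convert one line read from standard input.
printf 'Feb 2 00:20:41 c,info with spaces,some more info\n' | convert_log 2014
```

As with the original script, the message is rebuilt from whitespace-separated fields, so runs of several spaces inside it come out as single spaces.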