Question

I'm trying to extract a specific value (e.g. userAgent in this case) from a bunch of .gz files, which are compressed log files. Each log statement in these files looks like this:

2013-06-20;02:00:02.503 [664492205@qtp-446095113-8883]-Activity [response@12293 appId=testApp userAgent=BundleDeviceFamily/iPhone,iPad (iPad; iPad2,5; iPad2,5; iPhone OS 6.1.3) EXEC_TM=123  FLOW=response TOKN_TM=0 GW_TM=2314.529 http.status=200 id=029dde45-802c-462a-902b-138bc5490fba offeringId=iPad httpUrl= test.com AUD_TM=0 ipAddress=10.10.10.10 ]

2013-06-20;02:00:02.504 [664492205@qtp-446095113-8883]-Activity [response@12293 appId=testApp userAgent=FNetwork/609.1.4 Darwin/13.0.0 id=029dde45-802c-462a-902b-138bc5490fba EXEC_TM=123  FLOW=response TOKN_TM=0 GW_TM=2314.529 http.status=200  offeringId=iPad httpUrl= test.com AUD_TM=0 ipAddress=10.10.10.10 ]

In this case, I want to extract the userAgent field and display the result in one of the formats below:

userAgent=BundleDeviceFamily/iPhone,iPad (iPad; iPad2,5; iPad2,5; iPhone OS 6.1.3)
userAgent=FNetwork/609.1.4 Darwin/13.0.0

and so on.

Or print just the values, such as:

BundleDeviceFamily/iPhone,iPad (iPad; iPad2,5; iPad2,5; iPhone OS 6.1.3)
FNetwork/609.1.4 Darwin/13.0.0

EDIT: Just to add more info, these space-separated fields (key1=value1 key2=value2) can appear in any order.

Appreciate the help. Thanks!

Solution 2

Since you mentioned that key=value pairs can appear in any order, here is one way of doing it with awk.

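zcat input.gz | awk -F= '
{
  # Splitting on "=" leaves each key at the end of one field and its value at the start of the next.
  for (i = 1; i < NF; i++) {
    if ($i ~ /(^| )userAgent$/) {
      value = $(i + 1)
      sub(/ +[^ ]+$/, "", value)   # strip the next key and the space(s) before it
      print "userAgent=" value
    }
  }
}'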
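
If you only want the values (the second format in the question), the same idea can be wrapped in a small function. This is just a sketch: extract_user_agent is a made-up name, and it assumes a userAgent value never contains "=" and is always followed by at least one more key=value pair.

```shell
# Hypothetical helper: reads log lines on stdin, prints only the userAgent values.
# Assumption: values contain no "=" and another key=value pair follows userAgent.
extract_user_agent() {
  awk -F= '
  {
    for (i = 1; i < NF; i++) {
      # A field ending in " userAgent" means the next field begins with its value.
      if ($i ~ /(^| )userAgent$/) {
        value = $(i + 1)
        sub(/ +[^ ]+$/, "", value)   # drop the trailing " nextKey"
        print value
      }
    }
  }'
}
```

It would be used as: zcat input.gz | extract_user_agent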

OTHER TIPS

Using zcat and sed:

zcat input.gz | sed -n 's/.*\(userAgent=[^=]*\) [^ =][^ =]*=.*/\1/p'

It can also be a little shorter with GNU sed, which supports \+:

zcat input.gz | sed -n 's/.*\(userAgent=[^=]*\) [^ =]\+=.*/\1/p'

A grep and sed combination:

zcat input.gz | grep -o 'userAgent=[^=]*' | sed 's/ [^ ]*$//'

The zcat and grep steps can be combined with zgrep (thanks lhf):

zgrep -o 'userAgent=[^=]*' input.gz | sed 's/ [^ ]*$//'
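
Since the question mentions a bunch of .gz files, here is a rough sketch of the same grep and sed pipeline over several files at once. The function name user_agents_from_gz is hypothetical, and it carries the same assumption as above (no "=" inside the userAgent value). Decompressing everything to one stream first also sidesteps the file-name prefix that grep-style tools usually add when given more than one file.

```shell
# Hypothetical helper: prints userAgent=... for every log line in the given .gz files.
user_agents_from_gz() {
  # gzip -dc behaves like zcat: it decompresses all named files to stdout.
  gzip -dc "$@" |
    grep -o 'userAgent=[^=]*' |   # value plus the name of the next key
    sed 's/ [^ ]*$//'             # drop that trailing key name
}
```

For example, user_agents_from_gz /path/to/logs/*.gz | sort | uniq -c would give a count per user agent (the path is only a placeholder).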
Licensed under: CC-BY-SA with attribution