Question

I want to count production API usage from a backend access log. The access log looks like this:

"GET /service1/api1?querystr1=11.."
"GET /service1/api2?querystr2=22.."
"GET /service1/api2?querystrx=xx.."
"GET /service1/api3?querystry=zz.."
"GET /service1/api3?querystr1=33.."
"GET /service1/api3?querystr3=55.."

So the expected result of the search is:

/service1/api1  - 1
/service1/api2  - 2
/service1/api3  - 3

I know the following command returns all service1 URLs, but I do not know how to count the occurrences of each matched API.

egrep '"GET /service1/.*' myaccesslogs

Your help will be appreciated, thanks.


Let me add one subsequent question:

Log:

/service1/first.do?action=doTask&type=taskA&xx=yy&zz=dd&

condition:

grep -o '/service1/first.do?action=doTask&.*&'

expected: /service1/first.do?action=doTask&type=taskA

actual: /service1/first.do?action=doTask&type=taskA&xx=yy&zz=dd&

I tried:

egrep -o '/service1/first.do?action=doTask&.*?&'

But nothing matched; it looks like non-greedy (lazy) matching does not work for me. What is the correct condition?

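Answer: Adding -P (Perl-compatible regular expressions, available in GNU grep) enables the lazy quantifier .*?. The pattern has to be quoted so the shell does not interpret the &, and the literal ? has to be escaped, because in both egrep and -P mode an unescaped ? is a quantifier that makes the preceding "o" optional. Note that the match still includes the & that ends it.

grep -Po '/service1/first\.do\?action=doTask&.*?&'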
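
If -P is not available, a negated character class gives the same "stop at the next &" behaviour with plain grep. The sketch below is only an illustration: the variable names line and match are made up for this example, and it assumes a grep that supports -o (GNU and BSD grep do).

```shell
# Sample log line copied from the question.
line='/service1/first.do?action=doTask&type=taskA&xx=yy&zz=dd&'

# [^&]* cannot cross an '&', so the match ends right after the first
# parameter that follows action=doTask. '\.' and '[?]' match the
# literal dot and question mark.
match=$(printf '%s\n' "$line" | grep -o '/service1/first\.do[?]action=doTask&[^&]*')
printf '%s\n' "$match"
```

Unlike the lazy version, this one does not include the trailing &, so it prints exactly the expected string from the question.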

Solution

The -o option makes grep print only the matched part of each line. The matches are then sorted so that identical APIs end up on adjacent lines, because uniq only collapses adjacent duplicates. uniq -c prints each distinct line prefixed with the number of times it occurred.

grep -o 'GET /service1/[^?"]*' my.log | sort | uniq -c

Output

1 GET /service1/api1
2 GET /service1/api2
3 GET /service1/api3
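
To get the exact "path  - count" layout asked for in the question, the uniq -c output can be reshaped with awk. This is a minimal sketch, not the only way to do it: it writes the sample lines from the question to a temporary file, and the variable names log and counts are made up for the example.

```shell
# Sample log lines copied from the question, written to a temporary file.
log=$(mktemp)
cat > "$log" <<'EOF'
"GET /service1/api1?querystr1=11.."
"GET /service1/api2?querystr2=22.."
"GET /service1/api2?querystrx=xx.."
"GET /service1/api3?querystry=zz.."
"GET /service1/api3?querystr1=33.."
"GET /service1/api3?querystr3=55.."
EOF

# 1. grep -o keeps only "GET <path>", stopping before '?' or '"'.
# 2. sort | uniq -c counts identical lines.
# 3. uniq -c emits "<count> GET <path>"; awk swaps that to "<path>  - <count>".
counts=$(grep -o 'GET /service1/[^?"]*' "$log" \
  | sort | uniq -c \
  | awk '{print $3 "  - " $1}')
printf '%s\n' "$counts"

rm -f "$log"
```

Because the lines pass through sort before counting, the result comes out ordered by path.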

OTHER TIPS

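Try the command below. Note that awk does not guarantee the order in which "for (ct in count)" visits the entries, so pipe the result through sort if the order matters.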

$ sed 's/"\(.*\)?.*/\1/g' file | awk '{count[$2]++} END{ for (ct in count) { print ct," - ",count[ct]}}' 
/service1/api1  -  1
/service1/api2  -  2
/service1/api3  -  3

If you only need the total number of matching requests rather than a per-API breakdown, use the wc command like this:

egrep '"GET /service1/.*' myaccesslogs|wc -l

Licensed under: CC-BY-SA with attribution