Question

To quickly evaluate the timings of various operations recorded in a log file on a Linux server, I would like to extract them from the log and build a plain-text (TSV-style) histogram. To get a better idea of how the timings are distributed, I want to bin them into ranges of 0-10 ms, 10-20 ms, and so on.

The output should look something like this:

121    10
 39    20
 12    30
  7    40
  1   100

How can I achieve this with the usual set of Unix command-line tools?


Solution

Quick answer:

grep -Eo '[0-9]+' <file> | sed 's/$/ \/10*10/' | bc | sort -n | uniq -c

Detailed answer:

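  • grep -o extracts the text matching the pattern of your timings (here, any run of digits). Quote the pattern so the shell does not expand it as a glob. You may need several grep steps to isolate exactly the numbers you want from your logs.

  • sed appends an arithmetic expression to each number: integer division by the desired bin width, followed by multiplication by the same factor (here " /10*10").

  • bc evaluates each expression. With its default scale of 0 the division truncates, so every value is rounded down to the lower bound of its bin (for example, 37 becomes 30).

  • The well-known sort -n | uniq -c combination counts the occurrences in each bin.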

Licensed under: CC-BY-SA with attribution