Binned histogram of timings in log file on command line

https://stackoverflow.com//questions/24002183

20-12-2019

Question

To quickly evaluate the timings of various operations from a log file on a Linux server, I would like to extract them from the log and create a textual, TSV-style histogram. To get a better idea of how the timings are distributed, I want to bin them into ranges of 0-10 ms, 10-20 ms, etc.

The output should look something like this:

How can I achieve this with the usual set of Unix command-line tools?

Solution

Quick answer:

cat <file> | grep -Eo '[0-9]+' | sed 's/$/ \/10*10/' | bc | sort -n | uniq -c

Detailed answer:

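grep with -o extracts whatever matches the pattern of your timing or number, one match per line. You may need several grep steps to isolate exactly the numbers you want from your logs.
sed appends an arithmetic expression to each number: integer division by the bin width, then multiplication by the same width (for example, 17 becomes 17 /10*10).
bc evaluates each expression. Its default scale is 0, so the division truncates and each value is rounded down to the start of its bin (17 becomes 10).
sort -n | uniq -c is the well-known combination for counting occurrences; each output line is a count followed by the lower bound of the bin.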
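To see the binning steps on concrete input, here is a minimal sketch wrapped in a shell function. It is a variant of the pipeline, not the exact command from the answer: awk stands in for the sed and bc pair, and a final awk swaps the columns to get tab-separated output. The function name bin_timings, the "<digits>ms" log format, and the sample lines are all assumptions made for illustration.

```shell
# bin_timings [WIDTH]: read log lines on stdin and print "bin_start<TAB>count".
# Assumption: timings appear in the log as "<digits>ms" (hypothetical format).
bin_timings() {
    width="${1:-10}"                 # bin width in ms, defaults to 10
    grep -Eo '[0-9]+ms' |            # keep only tokens such as "17ms"
        awk -v w="$width" '{
            sub(/ms$/, "")           # strip the unit: "17ms" -> "17"
            print int($0 / w) * w    # round down to the start of the bin
        }' |
        sort -n |                    # numeric sort so equal bins are adjacent
        uniq -c |                    # count occurrences of each bin
        awk '{ print $2 "\t" $1 }'   # reorder to "bin<TAB>count"
}

# Hypothetical sample input, piped in instead of a real log file.
printf '%s\n' \
    'GET /a took 3ms' \
    'GET /b took 17ms' \
    'GET /c took 12ms' \
    'GET /d took 25ms' | bin_timings 10
```

With the four sample lines this prints three rows: bin 0 with count 1, bin 10 with count 2, and bin 20 with count 1. Passing a different width, such as bin_timings 20, changes the bin size without touching the rest of the pipeline.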

Licensed under: CC-BY-SA with attribution
