Question

I have a couple of processes, joined by pipes, running a tool I've written, and I would like to measure their combined memory usage with valgrind. So far, I have tried something like:

$ valgrind --tool=massif --trace-children=yes --peak-inaccuracy=0.5 --pages-as-heap=yes --massif-out-file=myProcesses".%p" myProcesses.script

Where myProcesses.script runs the equivalent of my tool foo twice, e.g.:

foo | foo > /dev/null

Valgrind doesn't capture the combined memory usage the way I expect. If I track this with top, I see (for the sake of argument) 10% memory usage from the first foo, and then a further 10% accumulates from the second foo before myProcesses.script completes. That is what I want to measure: the usage of both processes. Valgrind instead fails with the following assertion:

Massif: ms_main.c:1891 (ms_new_mem_brk): Assertion 'VG_IS_PAGE_ALIGNED(len)' failed.

Is there a way to collect memory usage data for commands I'm using in a piped fashion (using valgrind)? Or a similar tool that I can use to accurately automate these measurements?

The numbers that top returns while polling seem hand-wavy to me, and I am seeking accurate and repeatable measurements. If you have suggestions for alternative tools, I would welcome those as well.
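For repeatable per-process numbers without Valgrind, one option on Linux is to poll VmHWM (the peak resident set size, which the kernel tracks exactly rather than sampling the way top does) from /proc while the process runs. A minimal sketch, assuming a Linux /proc filesystem and using sleep as a stand-in for your foo; you would launch each pipeline stage this way to measure the stages separately:

```shell
#!/bin/sh
# Poll /proc/<pid>/status for VmHWM (peak RSS, in kB) until the child exits.
# "sleep 2" is a placeholder for the real tool under test.
sleep 2 &
pid=$!
peak=0
while kill -0 "$pid" 2>/dev/null; do
    # Field 2 of the VmHWM line is the value in kB.
    cur=$(awk '/^VmHWM/ {print $2}' "/proc/$pid/status" 2>/dev/null)
    [ -n "$cur" ] && [ "$cur" -gt "$peak" ] && peak=$cur
    sleep 0.2
done
echo "peak RSS: ${peak} kB"
```

Because VmHWM is a high-water mark maintained by the kernel, the final reading before exit is the process's true peak, not a sample that may have missed a spike.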

EDIT - Fixed typo with valgrind option.

EDIT 2 - For some reason, the --pages-as-heap option is giving us trouble with the binaries we're testing. Your examples run fine. A new page is created every time we enter a non-inlined function (stack overflows - heh). We wanted to count those, but they're relatively minor on the scale of memory usage we're testing. (Perhaps there aren't function calls in ls or less?) Removing --pages-as-heap got testing working again. Thanks to MrGomez for the great help.


Solution

With the correct valgrind version given in the errata, this seems to just work for me in Valgrind 3.6.1. My invocation:

<me>@harley:/tmp/test$ /usr/local/bin/valgrind --tool=massif \
  --trace-children=yes --peak-inaccuracy=0.5 --pages-as-heap=yes \
  --massif-out-file=myProcesses".%p" ./testscript.sh
==21067== Massif, a heap profiler
==21067== Copyright (C) 2003-2010, and GNU GPL'd, by Nicholas Nethercote
==21067== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==21067== Command: ./testscript.sh
==21067==
==21068== Massif, a heap profiler
==21068== Copyright (C) 2003-2010, and GNU GPL'd, by Nicholas Nethercote
==21068== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==21068== Command: /bin/ls
==21068==
==21070== Massif, a heap profiler
==21070== Copyright (C) 2003-2010, and GNU GPL'd, by Nicholas Nethercote
==21070== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==21069== Massif, a heap profiler
==21069== Copyright (C) 2003-2010, and GNU GPL'd, by Nicholas Nethercote
==21069== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
==21069== Command: /bin/sleep 5
==21069==
==21070== Command: /usr/bin/less
==21070==
==21068==
(END) ==21069==
==21070==
==21067==

The contents of my test script, testscript.sh:

ls | sleep 5 | less

Sparse contents from one of the files generated by --massif-out-file=myProcesses".%p" (myProcesses.21055):

desc: --peak-inaccuracy=0.5 --pages-as-heap=yes --massif-out-file=myProcesses.%p
cmd: ./testscript.sh
time_unit: i
#-----------
snapshot=0
#-----------
time=0
mem_heap_B=110592
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=1
#-----------
time=0
mem_heap_B=118784
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
...
#-----------
snapshot=18
#-----------
time=108269
mem_heap_B=1708032
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=peak
n2: 1708032 (page allocation syscalls) mmap/mremap/brk, --alloc-fns, etc.
 n3: 1474560 0x4015E42: mmap (mmap.S:62)
  n1: 1425408 0x4005CAC: _dl_map_object_from_fd (dl-load.c:1209)
   n2: 1425408 0x4007109: _dl_map_object (dl-load.c:2250)
    n1: 1413120 0x400CEEA: openaux (dl-deps.c:65)
     n1: 1413120 0x400D834: _dl_catch_error (dl-error.c:178)
      n1: 1413120 0x400C1E0: _dl_map_object_deps (dl-deps.c:247)
       n1: 1413120 0x4002B59: dl_main (rtld.c:1780)
        n1: 1413120 0x40140C5: _dl_sysdep_start (dl-sysdep.c:243)
         n1: 1413120 0x4000C6B: _dl_start (rtld.c:333)
          n0: 1413120 0x4000855: ??? (in /lib/ld-2.11.1.so)
    n0: 12288 in 1 place, below massif's threshold (01.00%)
  n0: 28672 in 3 places, all below massif's threshold (01.00%)
  n1: 20480 0x4005E0C: _dl_map_object_from_fd (dl-load.c:1260)
   n1: 20480 0x4007109: _dl_map_object (dl-load.c:2250)
    n0: 20480 in 2 places, all below massif's threshold (01.00%)
 n0: 233472 0xFFFFFFFF: ???
#-----------
snapshot=19
#-----------
time=108269
mem_heap_B=1703936
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=20
#-----------
time=200236
mem_heap_B=1839104
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty

Massif continues to report heap allocations in the remainder of my files. Note this is very similar to your error.
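The ms_print tool that ships with Valgrind renders these files as graphs, but for quick automation you can also mine them directly. As a sketch (assuming the field layout shown above and the myProcesses.<pid> naming produced by --massif-out-file), an awk pass pulls the peak mem_heap_B out of every per-process output file:

```shell
# Report the largest mem_heap_B snapshot value in each massif output file.
# File names follow the --massif-out-file=myProcesses".%p" pattern above.
for f in myProcesses.*; do
    awk -F= '/^mem_heap_B/ && $2+0 > max {max = $2+0}
             END {print FILENAME ": peak heap " max " B"}' "$f"
done
```

Summing these per-process peaks gives a rough upper bound on the pipeline's combined usage, though the individual peaks need not occur at the same moment.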

I theorize that your version of valgrind was built in debug mode, causing the asserts to fire. A rebuild from source (I used this, configured with the defaults from ./configure) should fix the issue.

Either way, this seems to be expected with Massif.

OTHER TIPS

Some programs allow you to preload the libmemusage.so library and get a report of the memory allocations that were recorded:

$ LD_PRELOAD=libmemusage.so less /etc/passwd

Memory usage summary: heap total: 36212, heap peak: 35011, stack peak: 15008
         total calls   total memory   failed calls
 malloc|         39           5985              0
realloc|          3             64              0  (nomove:2, dec:0, free:0)
 calloc|        238          30163              0
   free|         51          11546
Histogram for block sizes:
    0-15            128  45% ==================================================
   16-31             13   4% =====
   32-47            105  37% =========================================
   48-63              2  <1% 
   64-79              4   1% =
   80-95              5   1% =
   96-111             3   1% =
  112-127             3   1% =
  160-175             1  <1% 
  192-207             1  <1% 
  208-223             2  <1% 
  256-271             1  <1% 
  432-447             1  <1% 
  560-575             1  <1% 
  656-671             1  <1% 
  768-783             1  <1% 
  944-959             1  <1% 
 1024-1039            2  <1% 
 1328-1343            1  <1% 
 2128-2143            1  <1% 
 3312-3327            1  <1% 
 7952-7967            1  <1% 
 8240-8255            1  <1% 

Though I must admit that it doesn't always work -- LD_PRELOAD=libmemusage.so ls never reports anything, for example -- and I wish I knew the conditions that allow it to work or not work.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow