Question

I've got a memory leak in the Upstart init process (pid 1). What options do I have for debugging it?

EDIT: Suggest me some real tools for this, manually putting printfs or calculating memory allocations by hand isn't gonna cut it. Also dumping init core and poking around that is not really an option.

UPD1: valgrind doesn't work. Replacing /sbin/init on the kernel command line with the proper valgrind + init magic doesn't seem to be an option, as valgrind tries to read its own smaps from /proc, but that isn't available before init has run.

UPD2: dmalloc doesn't work either (doesn't compile on ARM).


Solution

A poor man's solution would be to just log every call to malloc and free, then comb through the logs and look for a pattern.

ld provides an amazing feature that could help here.

--wrap=symbol

Use a wrapper function for symbol. Any undefined reference to symbol will be resolved to "__wrap_symbol". Any undefined reference to "__real_symbol" will be resolved to symbol.

This can be used to provide a wrapper for a system function. The wrapper function should be called "__wrap_symbol". If it wishes to call the system function, it should call "__real_symbol".

Here is a trivial example:

void *
__wrap_malloc (size_t c)
{
   printf ("malloc called with %zu\n", c);
   return __real_malloc (c);
}

If you link other code with this file using --wrap malloc, then all calls to "malloc" will call the function "__wrap_malloc" instead. The call to "__real_malloc" in "__wrap_malloc" will call the real "malloc" function.

You may wish to provide a "__real_malloc" function as well, so that links without the --wrap option will succeed. If you do this, you should not put the definition of "__real_malloc" in the same file as "__wrap_malloc"; if you do, the assembler may resolve the call before the linker has a chance to wrap it to "malloc".


Update

Just to be clear on how this is useful.

  • Add a custom file to Upstart's build.

Like this:

#include <stddef.h>

/* Resolved by the linker to the real functions when --wrap is in effect. */
void *__real_malloc(size_t c);
void __real_free(void *addr);

void *__wrap_malloc(size_t c)
{
   void *malloced = __real_malloc(c);
   /* log malloced with its associated backtrace */
   /* something like: <malloced>: <bt-symbol-1>, <bt-symbol-2>, .. */
   return malloced;
}

void __wrap_free(void *addr)
{
   /* log addr with its associated backtrace */
   /* something like: <addr>: <bt-symbol-1>, <bt-symbol-2>, .. */
   __real_free(addr);
}

  • Recompile Upstart with debug symbols (-g) so you can get some nice backtraces. You can still optimize (-O2/-O3) the code if you wish.

  • Link Upstart with the extra LDFLAGS -Wl,--wrap=malloc -Wl,--wrap=free (the -Wl, prefix is needed when gcc drives the link, so that the options reach ld).
    Now anywhere Upstart calls malloc, the symbol will be magically resolved to your new symbol __wrap_malloc. Beautifully, this is all transparent to the compiled code, as it happens at link time.
    It's like shimming or instrumenting without any of the mess.

  • Run the recompiled Upstart as usual until you're sure the leak has occurred.

  • Look through the logs for mismatches: addresses returned by malloc that were never passed to free.
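The "log ... with its associated backtrace" placeholders in the wrappers above could be filled in along the following lines. This is only a sketch: leaklog_event and its log format are made up for illustration, and it assumes glibc's backtrace() and backtrace_symbols_fd() from <execinfo.h> are available on the target (on ARM you may also need to build with -funwind-tables to get useful frames). backtrace() can itself allocate memory the first time it is called, hence the recursion guard.

```c
#include <execinfo.h>   /* backtrace, backtrace_symbols_fd (glibc) */
#include <stdio.h>
#include <unistd.h>

#define MAX_FRAMES 16

/* Set while logging, so allocations made by the logger itself are ignored. */
static int in_logger;

/* Hypothetical helper: writes "<tag> <addr> <size>" plus a backtrace to fd. */
void leaklog_event(int fd, char tag, const void *addr, size_t size)
{
    void *frames[MAX_FRAMES];
    char line[64];
    int len, n;

    if (in_logger)
        return;
    in_logger = 1;

    /* Format into a stack buffer and use write(2) rather than printf. */
    len = snprintf(line, sizeof line, "%c %p %zu\n", tag, addr, size);
    if (len > 0 && write(fd, line, (size_t)len) == len) {
        n = backtrace(frames, MAX_FRAMES);
        backtrace_symbols_fd(frames, n, fd);   /* one frame per line */
    }

    in_logger = 0;
}
```

__wrap_malloc would then call something like leaklog_event(log_fd, 'M', malloced, c), and __wrap_free would call leaklog_event(log_fd, 'F', addr, 0), where log_fd is whatever descriptor you opened for the log.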

A couple of notes:

  • The --wrap=symbol feature does not work with function names that are actually macros. So watch out for #define malloc nih_malloc. If this is what libnih does, you'd need to use --wrap=nih_malloc and __wrap_nih_malloc instead.
  • Use gcc's builtin backtracing features (for example __builtin_return_address), or glibc's backtrace() from <execinfo.h>.
  • All of these changes only affect the recompiled Upstart executable.
  • You could dump the logs to an SQLite DB instead, which may make it easier to find mismatched mallocs and frees.
  • You can make your log format an SQL INSERT statement, then just insert the lines into a database post-mortem for further analysis.
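If a database feels like overkill, the matching can also be done by a small program. The sketch below assumes a made-up log format in which every allocation is logged as a line "M <hex-addr> ..." and every free as "F <hex-addr> ..."; any other line (backtrace frames, for instance) is skipped. find_unmatched is a hypothetical name, and the linear search is only meant for modest log sizes.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Reads the log from `in` and leaves the addresses that were malloc'd but
 * never freed in live[0..n-1], returning n. Allocations beyond `cap` live
 * entries are silently dropped in this sketch.
 */
size_t find_unmatched(FILE *in, uintptr_t *live, size_t cap)
{
    char line[256];
    size_t n = 0;

    while (fgets(line, sizeof line, in)) {
        char tag;
        uintptr_t addr;

        /* Only "M <addr>" and "F <addr>" lines are of interest. */
        if (line[1] != ' ')
            continue;
        if (sscanf(line, "%c %" SCNxPTR, &tag, &addr) != 2)
            continue;

        if (tag == 'M' && n < cap) {
            live[n++] = addr;
        } else if (tag == 'F') {
            /* Search newest first: recent allocations are freed most often. */
            for (size_t i = n; i-- > 0; ) {
                if (live[i] == addr) {
                    live[i] = live[--n];   /* swap-remove */
                    break;
                }
            }
        }
    }
    return n;
}
```

Whatever is left in live afterwards is the list of candidate leaks; their backtraces can then be looked up in the original log.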

OTHER TIPS

You can also use init unchanged, but create a wrapper which sets the MALLOC_CHECK_ environment variable (note the trailing underscore) to 1 or higher. This will let you see some memory allocation diagnostics.

A variation is to change init source code slightly to set that environment variable itself early before it starts using malloc.
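A wrapper of that kind might look roughly like this. It is a sketch under assumptions: the real init is taken to have been moved to /sbin/init.real (a hypothetical path), and the value 3 asks glibc to both print a diagnostic and abort. On recent glibc releases the checking code lives in a separate preloadable debug library, so the variable may have no effect there on its own.

```c
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical location of the original init binary. */
#define REAL_INIT "/sbin/init.real"

/*
 * Sets MALLOC_CHECK_ and replaces the current process with the real init.
 * Returns -1 only if setenv or execv failed.
 */
int exec_init_with_malloc_check(char *const argv[])
{
    if (setenv("MALLOC_CHECK_", "3", 1) != 0)
        return -1;
    execv(REAL_INIT, argv);
    return -1;   /* only reached when execv fails */
}
```

The wrapper's main would simply pass its own argv to this function, so the process keeps pid 1 across the exec.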

You can also, as AmineK suggested, add debug code to the init source code itself.

You can instrument your memory allocation yourself by hooking malloc/free calls and counting the number of bytes you allocate and free each time.
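One way to do the counting, sketched here with made-up names (counted_malloc, counted_free, counted_live_bytes), is to store each block's size in a small header in front of it. It assumes a single-threaded program and that every allocation and free is routed through these two functions; calloc and realloc would need the same treatment.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header placed in front of every block; the union keeps user data aligned. */
typedef union {
    size_t size;
    max_align_t align;
} hdr_t;

static size_t bytes_live;   /* bytes allocated and not yet freed */

void *counted_malloc(size_t n)
{
    hdr_t *h;

    if (n > SIZE_MAX - sizeof *h)
        return NULL;
    h = malloc(sizeof *h + n);
    if (h == NULL)
        return NULL;
    h->size = n;
    bytes_live += n;
    return h + 1;            /* user data starts right after the header */
}

void counted_free(void *p)
{
    hdr_t *h;

    if (p == NULL)
        return;
    h = (hdr_t *)p - 1;
    bytes_live -= h->size;
    free(h);
}

size_t counted_live_bytes(void)
{
    return bytes_live;
}
```

Logging counted_live_bytes() periodically shows whether the live total keeps climbing, though not which call site is responsible.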

You could try linking your version of Upstart with Google's TCMalloc. It comes with a built-in heap checker.

The heap checker can be enabled in two ways:

  • set the environment variable HEAPCHECK to one of { normal | strict | draconian }.
  • set HEAPCHECK to local and check code by hand with HeapProfileLeakChecker objects.

I don't know how to set an environment variable for init however.

How about running pmap on the process and examining which memory segments are growing? That may give you some idea of what is eating memory. A little scripting could make this process almost automatic**.

** In a past life, I actually wrote a script that would take n pmap snapshots of a set of running processes spaced t seconds apart. The output of that was fed into a perl script that identified segments that changed their size. I used it to locate several memory leaks in some commercial code. [I would share the scripts, but they are covered under IP (copyright) of a previous employer.]
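This is not the author's script, but the idea can be sketched in a few lines. The function below assumes you saved snapshots of /proc/1/maps (the same kind of data pmap reports) to files at intervals; mapped_bytes is a hypothetical helper that totals the size of all mappings whose line contains a given name, such as [heap].

```c
#include <stdio.h>
#include <string.h>

/*
 * Sums the sizes of all mappings in a /proc/<pid>/maps style stream whose
 * line contains `name`. Each line starts with "<start>-<end>" in hex.
 */
unsigned long long mapped_bytes(FILE *maps, const char *name)
{
    char line[512];
    unsigned long long total = 0;

    while (fgets(line, sizeof line, maps)) {
        unsigned long long start, end;

        if (sscanf(line, "%llx-%llx", &start, &end) != 2)
            continue;            /* not a mapping line */
        if (strstr(line, name) != NULL)
            total += end - start;
    }
    return total;
}
```

Calling it on two snapshots and comparing the results for [heap] shows whether the heap grew between them; the same comparison could be repeated for other mappings.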

  • John
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow