Use Google perftools (gperftools).
You can link your program against the library, or even LD_PRELOAD it, and it will profile your heap usage, generating snapshots as the program runs. It doesn't cost much performance. Once you see that the heap has already grown too big, you can stop the program and get a graph of where the memory is being spent.
EDIT: tutorial here
Example: