Question

I was investigating a severe performance hit in the spawn.c portion of the UnixBench benchmark, which measures process creation speed:

https://code.google.com/p/byte-unixbench/source/browse/trunk/UnixBench/src/spawn.c

I could not understand why I was getting very low numbers when running under CentOS (the process even stalled or halted), while temporarily booting into Debian gave dramatically higher performance.

I eventually tracked it down to the fact that I was preloading jemalloc 3.6, a high-performance replacement memory allocator, via /etc/ld.so.preload:

https://www.facebook.com/notes/facebook-engineering/scalable-memory-allocation-using-jemalloc/480222803919

Is this performance hit because every spawned process is loading its own copy of jemalloc?

Is there a way to avoid that and still have jemalloc auto-load? Why doesn't it share the library?


Solution

Does jemalloc have any other dependencies, such as pthread? If so, the additional load-time cost can add up. With pthread in particular, some functions that can be lock-free in single-threaded applications may actually take locks once the library is loaded, which slows them down. In any case, even just mapping an additional library into the process's address space and performing its relocations takes a significant amount of time. If the program being timed is minimal (I can't tell from your link exactly what is being timed), its execution time might be dominated by the work the dynamic linker does.

Licensed under: CC-BY-SA with attribution