The problem was quite hard to find as it consisted of two entirely independent components:
My executable was compiled with -fomit-frame-pointer. This resulted in the limit being reset, as the following example shows:
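    /* rlimit.cpp */
    #include <iostream>
    #include <thread>
    #include <vector>
    #include <sys/resource.h>

    class A
    {
    public:
        void foo()
        {
            // Read the limit back from the kernel inside the thread.
            struct rlimit limit;
            getrlimit(RLIMIT_AS, &limit);
            std::cout << "Limit: " << limit.rlim_cur << std::endl;
        }
    };

    int main()
    {
        // Only rlim_cur is assigned here; rlim_max is left unset and the
        // return value of setrlimit is not checked.
        struct rlimit limit;
        limit.rlim_cur = 500 * 1024 * 1024;
        setrlimit(RLIMIT_AS, &limit);
        // This prints the requested value, not one read back via getrlimit.
        std::cout << "Limit: " << limit.rlim_cur << std::endl;

        A a; // declared outside the loop so it outlives the threads
        std::vector<std::thread> t;
        for(int i = 0; i < 5; i++)
        {
            t.push_back(std::thread(&A::foo, &a));
        }

        // std::thread is not copyable, so iterate by reference.
        for(auto& thread : t)
            thread.join();

        return 0;
    }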
Outputs:
    > g++ -std=c++11 -pthread -fomit-frame-pointer rlimit.cpp -o limit
    > ./limit
    Limit: 524288000
    Limit: 18446744073709551615
    Limit: 18446744073709551615
    Limit: 18446744073709551615
    Limit: 18446744073709551615
    Limit: 18446744073709551615

    > g++ -std=c++11 -pthread rlimit.cpp -o limit
    > ./limit
    Limit: 524288000
    Limit: 524288000
    Limit: 524288000
    Limit: 524288000
    Limit: 524288000
    Limit: 524288000
For the image processing part I work with OpenCL. Apparently NVIDIA's implementation calls setrlimit itself and raises the soft limit (rlim_cur) to the hard limit (rlim_max).