It is possible to do it using thread-private variables. Those persist across subsequent parallel
regions:
void func(...)
{
    static double *buf;
    #pragma omp threadprivate(buf)

    #pragma omp parallel num_threads(nth)
    {
        buf = malloc(n * sizeof(double));
        ...
    }

    #pragma omp parallel num_threads(nth)
    {
        // Access buf here - it is still allocated
    }

    #pragma omp parallel num_threads(nth)
    {
        // Free the memory in the last parallel region
        free(buf);
    }
}
There are several key points to notice here. First, the number of threads that allocate buf
should match the number of threads that deallocate it; a smaller team in the last region would leak the buffers of the missing threads. Also, if parallel regions in between execute with larger teams, buf
will not be allocated in the additional threads. Therefore it is advisable to disable the dynamic team size feature of OpenMP (e.g. with omp_set_dynamic(0)) and to keep the team size the same in every region, for example with the num_threads
clause as shown above. Note that while dynamic adjustment is enabled, num_threads only sets an upper limit on the team size.
Second, local variables can be made thread-private only if they are static. Therefore, this method is not suitable for use in recursive functions.
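The code should compile and work as expected even if OpenMP support is disabled, provided the elided parts do not call the OpenMP runtime: the pragmas are then ignored, each block runs once in the single thread, and buf behaves as an ordinary static variable.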
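To make the points above concrete, here is a minimal self-contained sketch of the same pattern. The names run_persistent_buffers and thread_id are hypothetical and not part of the original answer, and the fill-and-verify steps are only there to demonstrate that each thread sees its own buffer again in the next region. It assumes the function is called from serial code, not from inside another parallel region.

```c
#include <stddef.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: thread index, or 0 when built without OpenMP. */
static int thread_id(void)
{
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;
#endif
}

/* Hypothetical demo: returns the number of threads whose buffer
 * survived into the second region, or -1 if any allocation failed. */
int run_persistent_buffers(int nth, size_t n)
{
    /* Must be static to be eligible for threadprivate. */
    static double *buf;
    #pragma omp threadprivate(buf)

    int failed = 0;
    int intact = 0;
    (void)nth; /* unused when the pragmas are ignored */

#ifdef _OPENMP
    /* Disable dynamic team sizing so every region gets the same team.
     * Note: this changes the setting for the calling thread's later regions too. */
    omp_set_dynamic(0);
#endif

    /* Region 1: every thread allocates and fills its own buffer. */
    #pragma omp parallel num_threads(nth) reduction(+:failed)
    {
        buf = malloc(n * sizeof *buf);
        if (buf == NULL) {
            failed += 1;
        } else {
            for (size_t i = 0; i < n; i++)
                buf[i] = (double)thread_id();
        }
    }

    /* Region 2: each thread checks that it still sees its own data. */
    #pragma omp parallel num_threads(nth) reduction(+:intact)
    {
        if (buf != NULL) {
            int ok = 1;
            for (size_t i = 0; i < n; i++)
                if (buf[i] != (double)thread_id())
                    ok = 0;
            intact += ok;
        }
    }

    /* Region 3: every thread frees its own buffer. */
    #pragma omp parallel num_threads(nth)
    {
        free(buf);
        buf = NULL; /* keeps a later call or a stray free harmless */
    }

    return failed ? -1 : intact;
}
```

Built with OpenMP enabled (e.g. -fopenmp with GCC), run_persistent_buffers(4, 1000) is expected to return the team size; built without it, the same source runs all three blocks in one thread and returns 1.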