Nested parallelism must be enabled explicitly, as it is disabled by default in most implementations. According to the OpenMP 4.0 standard, you must set the OMP_NESTED environment variable:
The OMP_NESTED environment variable controls nested parallelism by setting the initial value of the nest-var ICV. The value of this environment variable must be true or false. If the environment variable is set to true, nested parallelism is enabled; if set to false, nested parallelism is disabled. The behavior of the program is implementation defined if the value of OMP_NESTED is neither true nor false.
The following line should work for bash:
export OMP_NESTED=true
Furthermore, as noted by @HristoIliev in the comment below, you will very likely also want to set the OMP_NUM_THREADS environment variable to tune performance. Quoting the standard:
The value of this environment variable must be a list of positive integer values. The values of the list set the number of threads to use for parallel regions at the corresponding nested levels.
This means that one should set the value of OMP_NUM_THREADS to something like n,n-1, where n is the number of CPU cores. For instance:
export OMP_NUM_THREADS=8,7
for an 8-core system (example copied from the comment below).
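If you would rather not hard-code the core count, the n,n-1 value can be derived at shell startup. The following is only a sketch under some assumptions: it assumes bash on a Linux system where nproc (GNU coreutils) is available, and note that nproc reports the processing units available to the current process, which may be logical CPUs rather than physical cores. The variable names n and inner are just illustrative.

```shell
# Assumption: `nproc` exists and its result is a reasonable stand-in for "number of cores".
n=$(nproc)

# The inner level gets n-1 threads, but never fewer than 1,
# since every entry in the OMP_NUM_THREADS list must be a positive integer.
inner=$(( n > 1 ? n - 1 : 1 ))

export OMP_NESTED=true
export OMP_NUM_THREADS="${n},${inner}"

# Print the resulting settings so they can be checked before running the program.
echo "OMP_NESTED=${OMP_NESTED} OMP_NUM_THREADS=${OMP_NUM_THREADS}"
```

Put these lines in the script that launches your program, or source them into the current shell, so that the exported variables are visible to the OpenMP runtime.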