Difference between linking OpenMP with -fopenmp and -lgomp

Question

OpenMP is an intermediary between your code and its execution. Each #pragma omp statement are converted to calls to their according OpenMP library function, and it's all there is to it. The multithreaded execution (launching threads, joining and synchronizing them, etc.) is always handled by the Operating System (OS). All OpenMP does is handling these low-level OS-dependent threading calls for us portably in a short and sweet interface.

The -fopenmp flag is a high-level one that does more than include GCC's OpenMP implementation (gomp). This gomp library will require more libraries to access the threading functionality of the OS. On POSIX-compliant OSes, OpenMP is usually based on pthread, which needs to be linked. It may also need the realtime extension library (librt) to work on some OSes, while not on some other. When using dynamic linking, everything should be discovered automatically, but when you specified -static, I think you've fallen in the situation described by Jakub Jelinek here. But nowadays, pthread (and rt if needed) should be automatically linked when -static is used.

Aside from linking dependencies, the -fopenmp flag also activates some pragma statement processing. You can see throughout the GCC code (as here and here) that without the -fopenmp flag (which isn't trigged by only linking the gomp library), multiple pragmas won't be converted to the appropriate OpenMP function call. I just tried with some example code, and both -lgomp and -fopenmp produce a working executable that links against the same libraries. The only difference in my simple example that the -fopenmp has a symbol that the -lgomp doesn't have: GOMP_parallel@@GOMP_4.0+ (code here) which is the function that initializes the parallel section performing the forks requested by the #pragma omp parallel in my example code. Thus, the -lgomp version did not translate the pragma to a call to GCC's OpenMP implementation. Both produced a working executable, but only the -fopenmp flag produced a parallel executable in this case.

To wrap up, -fopenmp is needed for GCC to process all the OpenMP pragmas. Without it, your parallel sections won't fork any thread, which could wreak havoc depending on the assumptions on which your inner code was done.