What follows is specific to GNU's OpenMP runtime library libgomp, but other OpenMP runtimes do more or less the same. All links lead into the relevant parts of the GCC 4.8.2 source code.
The actual implementation of omp_get_thread_num() is very simple:
int
omp_get_thread_num (void)
{
  return gomp_thread ()->ts.team_id;
}
Each thread has a thread-local pointer to an instance of struct gomp_thread. Each thread is also part of a team of threads, and its state inside the team is represented by the ts member of type struct gomp_thread_team_state. The latter contains an unsigned int member called team_id, which gives the ID of the thread in its team.
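To make the nesting concrete, here is a minimal sketch of that layout. The mock_* names are hypothetical stand-ins, not the libgomp definitions, which contain many more members; only the ts and team_id member names are taken from the description above.

```c
/* Hypothetical, stripped-down stand-in for struct gomp_thread_team_state. */
struct mock_team_state
{
  unsigned int team_id;        /* ID of the thread within its team */
};

/* Hypothetical, stripped-down stand-in for struct gomp_thread. */
struct mock_thread
{
  struct mock_team_state ts;   /* state of the thread inside its team */
};

/* Same shape as omp_get_thread_num(), except that the per-thread
   instance is passed in explicitly instead of being looked up. */
int
mock_get_thread_num (const struct mock_thread *thr)
{
  return thr->ts.team_id;
}
```

Given an instance whose ts.team_id is 3, mock_get_thread_num() returns 3. The only thing the real function adds is the lookup of the calling thread's own instance.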
The way a thread locates its own instance of struct gomp_thread depends on whether the platform has TLS (thread-local storage) or not. If it does not, TLS is emulated by the POSIX threads library. Both implementations follow (taken from here):
with TLS:
extern __thread struct gomp_thread gomp_tls_data;
static inline struct gomp_thread *gomp_thread (void)
{
  return &gomp_tls_data;
}
The __thread keyword makes the global variable gomp_tls_data thread-local, which means that each thread gets its own copy of it.
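The effect of __thread can be seen in a small standalone sketch. It is not libgomp code; tls_counter, bump_counter and run_worker are hypothetical names chosen for illustration, and a compiler that supports the GNU __thread extension is assumed.

```c
#include <pthread.h>
#include <stddef.h>

/* Each thread gets its own copy of this variable, initialised to 0. */
static __thread int tls_counter = 0;

/* Thread body: increment the calling thread's copy, then report the
   value that this thread ended up with through the argument slot. */
static void *
bump_counter (void *arg)
{
  int increments = *(int *) arg;
  for (int i = 0; i < increments; i++)
    tls_counter++;
  *(int *) arg = tls_counter;
  return NULL;
}

/* Start one worker thread, wait for it, and return the final value of
   the worker's own copy of tls_counter (-1 if the thread could not be
   created). */
int
run_worker (int increments)
{
  pthread_t tid;
  int slot = increments;
  if (pthread_create (&tid, NULL, bump_counter, &slot) != 0)
    return -1;
  pthread_join (tid, NULL);
  return slot;
}
```

If the calling thread sets its own tls_counter to 100 and then calls run_worker (3), the worker reports 3 and the caller's copy is still 100 afterwards, because the two threads never touch the same storage.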
without TLS:
extern pthread_key_t gomp_tls_key;
static inline struct gomp_thread *gomp_thread (void)
{
  return pthread_getspecific (gomp_tls_key);
}
In that case pthread_getspecific() is used to obtain the calling thread's own pointer to the structure instance (located here).
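The key-based lookup can be sketched in the same way. The code below is not how libgomp sets up its key: the lazy allocation on first use, the use of pthread_once, and all mock_* names are assumptions made to keep the example self-contained. Only pthread_key_create, pthread_setspecific and pthread_getspecific are the actual POSIX calls.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-thread structure, standing in for struct gomp_thread. */
struct mock_thread
{
  unsigned int team_id;
};

static pthread_key_t mock_tls_key;
static pthread_once_t mock_key_once = PTHREAD_ONCE_INIT;

/* Create the key exactly once; free() is the destructor applied to a
   thread's value when that thread exits. */
static void
mock_make_key (void)
{
  pthread_key_create (&mock_tls_key, free);
}

/* Return the calling thread's instance, allocating a zeroed one on
   first use (an assumption of this sketch, not libgomp's approach). */
struct mock_thread *
mock_thread (void)
{
  pthread_once (&mock_key_once, mock_make_key);
  struct mock_thread *thr = pthread_getspecific (mock_tls_key);
  if (thr == NULL)
    {
      thr = calloc (1, sizeof *thr);
      if (thr != NULL && pthread_setspecific (mock_tls_key, thr) != 0)
        {
          free (thr);
          thr = NULL;
        }
    }
  return thr;
}

/* Thread body: store the team_id seen by this thread's own instance. */
static void *
read_team_id (void *out)
{
  struct mock_thread *thr = mock_thread ();
  *(unsigned int *) out = thr != NULL ? thr->team_id : (unsigned int) -1;
  return NULL;
}

/* Read team_id from inside a freshly created thread. */
unsigned int
team_id_in_new_thread (void)
{
  pthread_t tid;
  unsigned int seen = (unsigned int) -1;
  if (pthread_create (&tid, NULL, read_team_id, &seen) == 0)
    pthread_join (tid, NULL);
  return seen;
}
```

Repeated calls to mock_thread () from one thread return the same pointer, while a new thread gets a separate, zero-initialised instance, so a team_id written by the caller is not visible through team_id_in_new_thread ().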
pthread_getspecific() may itself use TLS to store the values, so one might wonder why two different implementations are provided. The answer is that accessing TLS directly can be faster than calling the Pthreads API functions. In that respect OS X is a peculiar case: the Mach-O executable format does not provide a GNU-compatible TLS implementation, and some people have reported that there the Pthreads API is actually faster than the emulated GNU TLS.